Virulence-associated regions have long been sought in Mycobacterium. The present 
invention concerns the identification of two genomic regions shown to be associated 
with a virulent phenotype in Mycobacteria, particularly in M. tuberculosis and in 
M. leprae. It also concerns fragments of said regions. 



The two regions are known as RD1 and RD5, as disclosed in Molecular Microbiology 
(1999), vol. 32, pages 643 to 655 (Gordon S.V. et al.). Both of these regions, or at 
least one of them, are absent from the vaccine strains of M. bovis BCG and from the 
M. microti strains that were used as live vaccines in the 1960's. 



Other applications encompassed by the present invention relate to the use of all or 
part of said regions to detect virulent strains of Mycobacteria, particularly 
M. tuberculosis, in humans and animals. RD1 and RD5 are considered virulence markers 
under the present invention. 



Recombinant Mycobacteria, particularly M. bovis BCG, whose genome has been modified by 
introduction of all or part of the RD1 region and/or the RD5 region can be used to 
modulate the immune system of patients affected with a cancer, for example bladder 
cancer. 



The present invention relates to a strain of M. bovis BCG or M. microti, wherein said 
strain has integrated part or all of the RD1 region responsible for enhanced 
immunogenicity to the tubercle bacilli, especially the genes coding for the ESAT-6 and 
CFP-10 antigens. These strains will be referred to as the M. bovis BCG::RD1 or 
M. microti::RD1 strains and are useful as a new improved vaccine for prevention of 
tuberculosis infections and for treating superficial bladder cancer. 

Mycobacterium bovis BCG (bacille Calmette-Guerin) has been used since 1921 to 
prevent tuberculosis although it is of limited efficacy against adult pulmonary disease in 
highly endemic areas. Mycobacterium microti, another member of the Mycobacterium 
tuberculosis complex, was originally described as the infective agent of a 
tuberculosis-like disease in voles (Microtus agrestis) in the 1930's (Wells, A. Q. 1937. 
Tuberculosis in wild voles. Lancet 1221 and Wells, A. Q. 1946. The murine type of 
tubercle bacillus. Medical Research Council special report series 259:1-42.). Until 
recently, M. microti 
strains were thought to be pathogenic only for voles, but not for humans and some were 
even used as a live-vaccine. In fact, the vole bacillus proved to be safe and effective in 
preventing clinical tuberculosis in a trial involving roughly 10,000 adolescents in the UK 
in the 1950's (Hart, P. D. a., and I. Sutherland. 1977. BCG and vole bacillus vaccines in 
the prevention of tuberculosis in adolescence and early adult life. British Medical 
Journal 2:293-295). At about the same time, another strain, OV166, was successfully 
administered to half a million newborns in Prague, former Czechoslovakia, without any 
serious complications (Sula, L., and I. Radkovsky. 1976. Protective effects of M. microti 
vaccine against tuberculosis. J. Hyg. Epid. Microbiol. Immunol. 20:1-6). M. microti 
vaccination has since been discontinued because it was no more effective than the 
frequently employed BCG vaccine. As a result, improved vaccines are needed for 
preventing and treating tuberculosis. 

The difficulty in attempting to improve this live vaccine is that the molecular 
mechanisms of both the attenuation and the immunogenicity of BCG are still poorly 
understood. Comparative genomic studies of all six members of the M. tuberculosis 
complex have identified more than 140 genes, whose presence is facultative, that may 
confer differences in phenotype, host range and virulence. Relative to the genome of the 
paradigm strain, M. tuberculosis H37Rv (S. T. Cole, et al., Nature 393, 537 (1998)), 
many of these genes occur in chromosomal regions that have been deleted from certain 
species (RD1-16, RvD1-5) (M. A. Behr, et al., Science 284, 1520 (1999); R. Brosch, et 
al., Infection Immun. 66, 2221 (1998); S. V. Gordon, et al., Molec Microbiol 32, 643 
(1999); H. Salamon, et al., Genome Res 10, 2044 (2000); G. G. Mahairas, et al., J. 
Bacteriol. 178, 1274 (1996) and R. Brosch, et al., Proc Natl Acad Sci USA 99, 3684 
(2002)). 

In connection with the invention and based on their distribution among tubercle bacilli 
and potential to encode virulence functions, RD1, RD3-5, RD7 and RD9 (Fig. 1A, B) 
were accorded highest priority for functional genomic analysis using "knock-ins" of M. 
bovis BCG to assess their potential contribution to the attenuation process. Clones 
spanning these RD regions were selected from an ordered M. tuberculosis H37Rv library 
of integrating shuttle cosmids (S. T. Cole, et al., Nature 393, 537 (1998) and W. R. 
Bange, et al., Tuber. Lung Dis. 79, 171 (1999)), and individually electroporated into BCG 
Pasteur, where they inserted stably into the attB site (M. H. Lee, et al., Proc. Natl. 
Acad. Sci. USA 88, 3111 (1991)). 



We have found that only reintroduction of RD1 led to profound phenotypic 
alteration. Strikingly, the BCG::RD1 "knock-in" grew more vigorously than BCG 
controls in immuno-deficient mice, inducing extensive splenomegaly and granuloma 
formation. 

The RD1 deletion is restricted to the avirulent strains M. bovis BCG and M. microti. 
Although the endpoints are not identical, the deletions have removed from both vaccine 
strains a cluster of six genes (Rv3871-Rv3876) that are part of the ESAT-6 locus 
(Fig. 1A; S. T. Cole, et al., Nature 393, 537 (1998) and F. Tekaia, et al., Tubercle 
Lung Disease 79, 329 (1999)). 

Among the missing products are members of the mycobacterial PE (Rv3872), PPE 
(Rv3873), and ESAT-6 protein families. Despite lacking obvious 
secretion signals, ESAT-6 (Rv3875) and the related protein CFP-10 (Rv3874) are 
abundant components of short-term culture filtrate, acting as immunodominant T-cell 
antigens that induce potent Th1 responses (F. Tekaia, et al., Tubercle Lung Disease 79, 
329 (1999); A. L. Sorensen, et al., Infect. Immun. 63, 1710 (1995) and R. Colangelli, et 
al., Infect. Immun. 68, 990 (2000)). 

In summary, we have discovered that the restoration of RD1 to M. bovis BCG leads to 
increased persistence in immunocompetent mice. The M. bovis BCG::RD1 strain 
induces RD1-specific immune responses of the Th1 type, has enhanced immunogenicity 
and confers better protection than M. bovis BCG alone in the mouse model of 
tuberculosis. The M. bovis BCG::RD1 vaccine is significantly more virulent than M. 
bovis BCG in immunodeficient mice but considerably less virulent than M. tuberculosis. 

In addition, we show that M. microti lacks a part of the RD1 region (RD1mic) that 
differs from, but overlaps with, the part missing from M. bovis BCG, and our results 
indicate that reintroduction of RD1 confers increased virulence on BCG::RD1 in 
immunodeficient mice. The rare strains of M. microti that are associated with human 
disease contain a region referred to as RD5mic, whereas those from voles do not. 




M. bovis BCG vaccine could be improved by reintroducing other genes encoding ESAT-6 
family members that have been lost, notably those found in the RD8 and RD5 loci of 
M. tuberculosis. These regions also code for additional T-cell antigens. 

M. bovis BCG::RD1 could be improved by reintroducing the RD8 and RD5 loci of M. 
tuberculosis. 

M. bovis BCG vaccine could be improved by overexpressing the genes contained in the 
RD1, RD5 and RD8 regions. 

Accordingly, these new strains, showing greater persistence and enhanced 
immunogenicity, represent an improved vaccine for preventing tuberculosis and treating 
bladder cancer. 

In addition, the greater persistence of these recombinant strains is an advantage for 
the presentation of other antigens, for instance antigens from HIV in humans, in order 
to induce protective immune responses. These improved strains may also be of use in 
veterinary medicine, for instance in preventing bovine tuberculosis. 

Description 

Therefore, the present invention is aimed at a strain of M. bovis BCG or M. microti, 
wherein said strain has integrated all or part of the RD1 region responsible for enhanced 
immunogenicity to the tubercle bacilli. These strains will be referred to as the M. bovis 
BCG::RD1 or M. microti::RD1 strains. 





In connection with the invention, "part or all of the RD1 region" means that the strain 
has integrated a portion of DNA originating from Mycobacterium tuberculosis, which 
comprises at least one gene selected from Rv3871, Rv3872 (mycobacterial PE), Rv3873 
(PPE), Rv3874 (CFP-10), Rv3875 (ESAT-6), and Rv3876. The term "gene" refers herein 
to the coding sequence in frame with its natural promoter as well as to the 
coding sequence that has been isolated and placed in frame with an exogenous promoter, 
for example a promoter capable of directing a high level of expression of said coding 
sequence. 

In a specific aspect, the invention relates to a strain of M. bovis BCG or M. microti 
wherein said strain has integrated at least one gene selected from Rv3871, Rv3872 (SEQ 
ID No 2, mycobacterial PE), Rv3873 (SEQ ID No 3, PPE), Rv3874 (SEQ ID No 4, CFP-10), 
Rv3875 (SEQ ID No 5, ESAT-6), and Rv3876, preferably CFP-10, ESAT-6 or both 
(SEQ ID No 6). 

These genes can be mutated (deletion, insertion or base modification) so as to maintain 
the improved immunogenicity while decreasing the virulence of the strains. Using 
routine procedures, the man skilled in the art can select the M. bovis BCG::RD1 or M. 
microti::RD1 strains, in which a mutated gene has been integrated, showing improved 
immunogenicity and lower virulence. 

We have shown here that introduction of the RD1 region makes the vaccine strains 
induce a more effective immune response against a challenge with M. tuberculosis. 
However, this first generation of constructs can be followed by other, more fine-tuned 
generations of constructs, as the complemented BCG::RD1 vaccine strain also showed a 
more virulent phenotype in severely immunocompromised (SCID) mice. Therefore, the 
BCG RD1+ constructs may be modified so as to be applicable as vaccine strains while 
being safe for immunocompromised individuals. 

In this perspective, the man skilled in the art can adapt the BCG::RD1 strain by 
designing BCG vaccine strains that carry only parts of the genes coding for ESAT-6 
or CFP-10 in a mycobacterial expression vector (for example pSM81) under the control 
of a promoter, more particularly an hsp60 promoter. For example, at least one portion of 
the esat-6 gene that codes for immunogenic 20-mer peptides of ESAT-6 active as T-cell 
epitopes (Mustafa AS, Oftung F, Amoudy HA, Madi NM, Abal AT, Shaban F, Rosenkrands I, 
& Andersen P. (2000) Multiple epitopes from the Mycobacterium tuberculosis 
ESAT-6 antigen are recognized by antigen-specific human T cell lines. Clin Infect Dis. 
30 Suppl 3:S201-5; peptides P1 to P8 are incorporated herein in the description) could 
be cloned into this vector and electroporated into BCG, resulting in a BCG strain that 
produces these epitopes. 

Alternatively, the ESAT-6 and CFP-10 encoding genes (for example on plasmid RD1-AP34 
and/or RD1-2F9) could be altered by directed mutagenesis (using, for example, the 
QuikChange Site-Directed Mutagenesis Kit from Stratagene) in such a way that most of the 
immunogenic peptides of ESAT-6 remain intact, but the biological functionality of 
ESAT-6 is lost. 

This approach could result in a more protective BCG vaccine without increasing the 
virulence of the recombinant BCG construct. 

Therefore, the invention is also aimed at a method for preparing and selecting M. bovis 
BCG or M. microti strains comprising a step consisting of modifying the M. bovis 
BCG::RD1 or M. microti::RD1 strains as defined above by insertion, deletion or mutation 
in the integrated RD1 region, more particularly in the esat-6 or CFP-10 gene, said 
method leading to strains that are less virulent for immuno-depressed individuals. 
Together, these methods would make it possible to determine whether the effect seen with 
our BCG::RD1 strain is caused by the presence of additional T-cell epitopes from ESAT-6 
and CFP-10, resulting in increased immunogenicity, by better fitness of the recombinant 
BCG::RD1 clones, resulting in longer exposure of the immune system to the vaccine, or 
by a combination of both factors. 
In a preferred embodiment, the invention is aimed at the M. bovis BCG::RD1 strains 
which have integrated a cosmid herein referred to as RD1-2F9 or RD1-AP34, 
contained in the E. coli strains deposited on April 2, 2002 at the CNCM (Institut Pasteur, 
25, rue du Docteur Roux, 75724 Paris cedex 15, France) under the accession numbers 
I-2831 and I-2832 respectively. RD1-2F9 is a cosmid comprising a portion of the 
Mycobacterium tuberculosis H37Rv genome that spans the RD1 region and the 
hygromycin resistance gene. RD1-AP34 is a cosmid comprising a portion of the 
Mycobacterium tuberculosis DNA containing two genes coding for ESAT-6 and CFP-10 
as well as a gene conferring resistance to kanamycin. 



The construct RD1-AP34 contains a 3909 bp fragment of the M. tuberculosis H37Rv 
genome from region 4350459 bp to 4354367 bp cloned into an integrating vector pKint 
(SEQ ID No 1). The Accession No. of the segment 160 of the M. tuberculosis H37Rv 
genome that contains this region is AL022120. 
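
As a quick consistency check on the figures quoted above, the length of a fragment 
follows directly from its inclusive genome coordinates. The sketch below is 
illustrative only: the function names are hypothetical, and the mapping from genome 
coordinates to fragment positions assumes that the first genome position of the insert 
corresponds to position 1 of the fragment. 

```python
# Illustrative check of the RD1-AP34 insert coordinates quoted in the text.
# Function names are hypothetical; only the two coordinates come from the text.

RD1_AP34_START = 4350459  # first H37Rv genome position of the insert
RD1_AP34_END = 4354367    # last H37Rv genome position of the insert


def fragment_length(start: int, end: int) -> int:
    """Length in bp of a fragment given 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def to_fragment_position(genome_pos: int,
                         start: int = RD1_AP34_START,
                         end: int = RD1_AP34_END) -> int:
    """Map a genome coordinate to a 1-based position within the fragment.

    Assumes fragment position 1 corresponds to genome position `start`.
    """
    if not start <= genome_pos <= end:
        raise ValueError("genome position lies outside the fragment")
    return genome_pos - start + 1


if __name__ == "__main__":
    print(fragment_length(RD1_AP34_START, RD1_AP34_END))
```

With inclusive coordinates the stated endpoints give the 3909 bp quoted for the 
fragment. 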

SEQ ID No 1 

1 - gaattcccat ccagtgagtt caaggtcaag cggcgccccc ctggccaggc atttctcgtc 
61 - tcgccagacg gcaaagaggt catccaggcc ccctacatcg agcctccaga agaagtgttc 
121 - gcagcacccc caagcgccgg ttaagattat ttcattgccg gtgtagcagg acccgagctc 
181 - agcccggtaa tcgagttcgg gcaatgctga ccatcgggtt tgtttccggc tataaccgaa 
241 - cggtttgtgt acgggataca aatacaggga gggaagaagt aggcaaatgg aaaaaatgtc 
301 - acatgatccg atcgctgccg acattggcac gcaagtgagc gacaacgctc tgcacggcgt 
361 - gacggccggc tcgacggcgc tgacgtcggt gaccgggctg gttcccgcgg gggccgatga 
421 - ggtctccgcc caagcggcga cggcgttcac atcggagggc atccaattgc tggcttccaa 
481 - tgcatcggcc caagaccagc tccaccgtgc gggcgaagcg gtccaggacg tcgcccgcac 
541 - ctattcgcaa atcgacgacg gcgccgccgg cgtcttcgcc gaateggccc ccaacacatc 
601 - ggagggagtg atcaccatgc tgtggcacgc aatgccaccg gagctaaata ccgcacggct 
661 - gatggccggc gcgggtccgg ctccaatgct tgcggcggcc gcgggatggc agacgctttc 
721 - ggcggctctg gacgctcagg ccgtcgagtt gaccgcgcgc ctgaactctc tgggagaagc 
781 - ctggactgga ggtggcagcg acaaggcgct tgcggctgca acgccgatgg tggtctggct 
841 - acaaaccgcg tcaacacagg ccaaeacccg tgcgatgcag gcgacggcgc aagccgcggc 
901 - atacacccag gccatggcca cgacgccgtc gctgccggag atcgccgcca accacatcac 
961 - ccaggccgtc cttacggcca ccaacttctt cggtatcaac acgatcccga tcgcgttgac 
1021 - cgagatggat tatttcatcc gtatgtggaa ccaggcagcc ctggcaatgg aggtctacca 
1081 - ggccgagacc gcggttaaca cgcttttcga gaagctcgag ccgatggcgt cgatccttga 
1141 - tcccggcgcg agccagagca cgacgaaccc gatcttcgga atgccctccc ctggcagctc 
1201 - aacaccggtt ggccagttgc cgccggcggc tacccagacc ctcggccaac tgggtgagat 
1261 - gagcggcccg atgcagcagc tgacccagcc gctgcagcag gtgacgtcgt tgttcagcca 
1321 - ggtgggcggc accggcggcg gcaacccagc cgacgaggaa gccgcgcaga tgggcctgct 
1381 - cggcaccagt ccgctgtcga accatccgct ggctggtgga tcaggcccca gcgcgggcgc 
1441 - gggcctgctg cgcgcggagt cgctacctgg cgcaggtggg tcgttgaccc gcacgccgct 
1501 - gatgtctcag ctgatcgaaa agccggttgc cccctcggtg atgccggcgg ctgctgccgg 
1561 - atcgtcggcg acgggtggcg ccgctccggt gggtgcggga gcgatgggcc agggtgcgca 
1621 - atccggcggc tccaccaggc cgggtctggt cgcgccggca ccgctcgcgc aggagcgtga 
1681 - agaagacgac gaggacgact gggacgaaga ggacgactgg tgagctcccg taatgacaac 
1741 - agacttcccg gccacccggg ccggaagact tgccaacatt ttggcgagga aggtaaagag 
1801 - agaaagtagt ccagcatggc agagatgaag accgatgccg ctaccctcgc gcaggaggca 
1861 - ggtaatttcg agcggatctc cggcgacctg aaaacccaga tcgaccaggt ggagtcgacg 
1921 - gcaggttcgt tgcagggcca gtggcgcggc gcggcgggga cggccgccca ggccgcggtg 
1981 - gtgcgcttcc aagaagcagc caataagcag aagcaggaac tcgacgagat ctcgacgaat 
2041 - attcgtcagg ccggcgtcca atactcgagg gccgacgagg agcagcagca ggcgctgtcc 
2101 - tcgcaaatgg gcttctgacc cgctaatacg aaaagaaacg gagcaaaaac atgacagagc 
2161 - agcagtggaa tttcgcgggt atcgaggccg cggcaagcgc aatccaggga aatgtcacgt 
2221 - ccattcattc cctccttgac gaggggaagc agtccctgac caagctcgca gcggcctggg 
2281 - gcggtagcgg ttcggaggcg taccagggtg tccagcaaaa atgggacgcc acggctaccg 
2341 - agcteaacaa cgcgctgcag aacctggcgc ggacgatcag cgaagccggt caggcaatgg 
2401 - cttcgaccga aggcaacgtc actgggatgt tcgcataggg caacgccgag ttcgcgtaga 
2461 - atagcgaaac acgggatcgg gcgagttcga ccttccgtcg gtctcgccct ttctcgtgtt 
2521 - tatacgtttg agcgcactct gagaggttgt catggcggcc gactacgaca agctcttccg 
2581 - gccgcacgaa ggtatggaag ctccggacga tatggcagcg cagccgttct tcgaccccag 
2641 - tgcttcgttt ccgccggcgc ccgcatcggc aaacctaccg aagcccaacg gccagactcc 
2701 - gcccccgacg tccgacgacc tgtcggagcg gttcgtgtcg gccccgccgc cgccaccccc 
2761 - acccccacct ccgcctccgc caactccgat gccgatcgcc gcaggagagc cgccctcgcc 
2821 - ggaaccggcc gcatctaaac cacccacacc ccccatgccc atcgccggac ccgaaccggc 
2881 - cccacccaaa ccacccacac cccccatgcc catcgccgga cccgaaccgg ccccacccaa 
2941 - accacccaca cctccgatgc ccatcgccgg acctgcaccc accccaaccg aatcccagtt 
3001 - ggcgcccccc agaccaccga caccacaaac gccaaccgga gcgccgcagc aaccggaatc 
3061 - accggcgccc cacgtaccct cgcacgggcc acatcaaccc cggcgcaccg caccagcacc 
3121 - gccctgggca aagatgccaa tcggcgaacc cccgcccgct ccgtccagac cgtctgcgtc 
3181 - cccggccgaa ccaccgaccc ggcctgcccc ccaacactcc cgacgtgcgc gccggggtca 
3241 - ccgctatcgc acagacaccg aacgaaacgt cgggaaggta gcaactggtc catccatcca 
3301 - ggcgcggctg cgggcagagg aagcatccgg cgcgcagctc gcccccggaa cggagccctc 
3361 - gccagcgccg ttgggccaac cgagatcgta tctggctccg cccacccgcc ccgcgccgac 
3421 - agaacctccc cccagcccct cgccgcagcg caactccggt cggcgtgccg agcgacgcgt 
3481 - ccaccccgat ttagccgccc aacatgccgc ggcgcaacct gattcaatta cggccgcaac 
3541 - cactggcggt cgtcgccgca agcgtgcagc gccggatctc gacgcgacac agaaatcctt 
3601 - aaggccggcg gccaaggggc cgaaggtgaa gaaggtgaag ccccagaaac cgaaggccac 
3661 - gaagccgccc aaagtggtgt cgcagcgcgg ctggcgacat tgggtgcatg cgttgacgcg 
3721 - aatcaacctg ggcctgtcac ccgacgagaa gtacgagctg gacctgcacg ctcgagtccg 
3781 - ccgcaatccc cgcgggtcgt atcagatcgc cgtcgtcggt ctcaaaggtg gggctggcaa 
3841 - aaccacgctg acagcagcgt tggggtcgac gttggctcag gtgcgggccg accggatcct 
3901 - ggctctaga 
pos. 0001-0006 EcoRI-restriction site 
pos. 0286-0583 Rv3872 coding for a PE-Protein (SEQ ID No 2) 
pos. 0616-1720 Rv3873 coding for a PPE-Protein (SEQ ID No 3) 
pos. 1816-2115 Rv3874 coding for Culture Filtrate Protein 10 kD (CFP10) (SEQ ID No 4) 
pos. 2151-2435 Rv3875 coding for Early Secreted Antigen Target 6 kD (ESAT6) (SEQ ID No 5) 
pos. 3903-3909 XbaI-restriction site 
pos. 1816-2435 CFP-10 gene + esat-6 gene (SEQ ID No 6) 
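
The position list above can be read as 1-based, inclusive coordinates on SEQ ID No 1. 
The following sketch, given only as an illustration, shows how such coordinates 
translate into sub-sequence extraction and feature lengths. The helper names are 
hypothetical, and only the first 60 nucleotides of the listing are reproduced, which is 
enough to check the EcoRI site. 

```python
# Feature coordinates copied from the position list (1-based, inclusive).
# Helper names are hypothetical and serve illustration only.
FEATURES = {
    "EcoRI site": (1, 6),
    "Rv3872 (PE)": (286, 583),
    "Rv3873 (PPE)": (616, 1720),
    "Rv3874 (CFP-10)": (1816, 2115),
    "Rv3875 (ESAT-6)": (2151, 2435),
    "CFP-10 + esat-6": (1816, 2435),
}

# First 60 nucleotides of SEQ ID No 1 as printed in the listing.
SEQ_START = (
    "gaattcccat" "ccagtgagtt" "caaggtcaag"
    "cggcgccccc" "ctggccaggc" "atttctcgtc"
)


def extract(seq: str, start: int, end: int) -> str:
    """Return the sub-sequence for 1-based inclusive coordinates."""
    if start < 1 or end > len(seq) or end < start:
        raise ValueError("coordinates outside the sequence")
    return seq[start - 1:end]


def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates."""
    return end - start + 1


if __name__ == "__main__":
    print(extract(SEQ_START, *FEATURES["EcoRI site"]))
    for name, (start, end) in FEATURES.items():
        print(name, feature_length(start, end))
```

The first six nucleotides of the listing, gaattc, correspond to the EcoRI recognition 
sequence. 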



The sequence of the fragment RD1-2F9 (~32 kb) covers the region of the M. 
tuberculosis genome AL123456 from ca. 4337 kb to ca. 4369 kb, and also contains the 
sequence described above. 

Such strains fulfill the aim of the invention, which is to provide an improved 
tuberculosis vaccine or M. bovis BCG-based prophylactic or therapeutic agent, or a 
recombinant M. microti derivative for these purposes. 

The above described M. bovis BCG::RD1 strains are better tuberculosis vaccines than M. 
bovis BCG. These strains can also be improved by reintroducing other genes found in the 
RD8 and RD5 loci of M. tuberculosis. These regions code for additional T-cell antigens. 
As indicated, overexpressing the genes contained in the RD1, RD5 and RD8 regions by 
means of exogenous promoters is encompassed by the invention. The same applies 
regarding M. microti::RD1 strains. M. microti strains could also be improved by 
reintroducing the RD8 locus of M. tuberculosis. 



In a second embodiment, the invention is directed to a cosmid or a plasmid comprising 
part or all of the RD1 region originating from Mycobacterium tuberculosis, said region 
comprising at least one gene selected from Rv3871, Rv3872 (mycobacterial PE), 
Rv3873 (PPE), Rv3874 (CFP-10), Rv3875 (ESAT-6), and Rv3876. Preferably, such 
cosmids or plasmids comprise CFP-10, ESAT-6 or both. The invention also relates to 
the use of these cosmids or plasmids for transforming M. bovis BCG or M. microti 
strains. As indicated above, these cosmids or plasmids may comprise a mutated gene 
selected from Rv3871 to Rv3876, said mutated gene being responsible for the improved 
immunogenicity and decreased virulence. 

In another embodiment, the invention embraces a pharmaceutical composition 
comprising a strain as depicted above. 

In addition to the strains, these pharmaceutical compositions may contain suitable 
pharmaceutically-acceptable carriers comprising excipients and auxiliaries which 
facilitate processing of the living vaccine into preparations which can be used 
pharmaceutically. Further details on techniques for formulation and administration may 
be found in the latest edition of Remington's Pharmaceutical Sciences (Mack 
Publishing Co., Easton, Pa.). 

Preferably, such a composition is suitable for oral, intravenous or subcutaneous 
administration. 

The determination of the effective dose is well within the capability of those skilled in 
the art. A therapeutically effective dose refers to that amount of active ingredient, i.e. 
the number of bacteria of the strain administered, which ameliorates the symptoms or 
condition. Therapeutic efficacy and toxicity may be determined by standard 
pharmaceutical procedures in experimental animals, e.g., ED50 (the dose therapeutically 
effective in 50% of the population) and LD50 (the dose lethal to 50% of the population). 
The dose ratio of toxic to therapeutic effects is the therapeutic index, and it can be 
expressed as the ratio LD50/ED50. Pharmaceutical compositions which exhibit large 
therapeutic indices are preferred. Of course, ED50 is to be modulated according to the 
mammal to be treated or vaccinated. In this regard, the invention contemplates a 
composition suitable for human administration as well as a veterinary composition. 
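
The therapeutic index defined above is a simple ratio. The sketch below illustrates 
the calculation; the function name and the dose values are hypothetical and are not 
experimental data from this disclosure. 

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be expressed in the same unit (for example cfu).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical doses in cfu, chosen only to illustrate the ratio.
    print(therapeutic_index(ld50=1e8, ed50=1e6))
```

A larger ratio means a wider margin between the effective and the lethal dose, which 
is why compositions with large therapeutic indices are preferred. 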

The invention is also aimed at a vaccine comprising a M. bovis BCG::RD1 or M. 
microti::RD1 strain as depicted above and a suitable carrier. This vaccine is especially 
useful for preventing tuberculosis. It can also be used for treating bladder cancer. 

The invention also concerns a product comprising a strain as depicted above and at least 
one protein selected from ESAT-6 and CFP-10, or an epitope derived therefrom, for 
separate, simultaneous or sequential use for treating tuberculosis. 

In still another embodiment, the invention concerns the use of a M. bovis BCG::RD1 or 
M. microti::RD1 strain as depicted above for preventing or treating tuberculosis. 
It also concerns the use of a M. bovis BCG::RD1 or M. microti::RD1 strain as a powerful 
adjuvant/immunomodulator used in the treatment of superficial bladder cancer. 

The invention also contemplates the identification at the species level of members of the 
M. tuberculosis complex by means of an RD-based molecular diagnostic test. Inclusion 
of markers for RD1mic and RD5mic would improve the tests and act as predictors of 
virulence, especially in humans. In this regard, the invention concerns a diagnostic kit 
comprising DNA probes and primers specifically hybridizing to a DNA portion of the 
RD1 or RD5 region, more particularly probes hybridizing under stringent conditions to a 
gene selected from Rv3871, Rv3872 (mycobacterial PE), Rv3873 (PPE), Rv3874 (CFP-10), 
Rv3875 (ESAT-6), and Rv3876, preferably CFP-10 and ESAT-6. As used herein, 
the term "stringent conditions" refers to conditions which permit hybridization between 
the probe sequences and the polynucleotide sequence to be detected. Suitably stringent 
conditions can be defined by, for example, the concentrations of salt or formamide in the 
prehybridization and hybridization solutions, or by the hybridization temperature, and 
are well known in the art. In particular, stringency can be increased by reducing the 
concentration of salt, increasing the concentration of formamide, or raising the 
hybridization temperature. The temperature range corresponding to a particular level of 
stringency can be further narrowed by calculating the purine to pyrimidine ratio of the 
nucleic acid of interest and adjusting the temperature accordingly. Variations on the 
above ranges and conditions are well known in the art. 
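
The qualitative relationships stated above can be illustrated with a commonly cited 
empirical approximation of the melting temperature (Tm) of long DNA-DNA hybrids. 
This formula is not given in the present text and is used here only as an assumption; 
its coefficients vary between references, and it is not suited to short 
oligonucleotides. The function names and the example probe parameters are 
hypothetical. 

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def estimated_tm(gc: float, length: int, na_molar: float,
                 formamide_percent: float = 0.0) -> float:
    """Approximate Tm (deg C) of a long DNA-DNA hybrid.

    Assumed empirical approximation (not taken from the text):
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC)
             - 0.61*(%formamide) - 600/N
    """
    if length <= 0 or na_molar <= 0:
        raise ValueError("length and salt concentration must be positive")
    return (81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc
            - 0.61 * formamide_percent - 600.0 / length)


if __name__ == "__main__":
    # Hypothetical 300-nucleotide probe with 60% GC content.
    for na, formamide in [(1.0, 0.0), (0.1, 0.0), (1.0, 50.0)]:
        tm = estimated_tm(60.0, 300, na, formamide)
        print(f"[Na+]={na} M, formamide={formamide}%: Tm ~ {tm:.1f} C")
```

Under this approximation, lowering the salt concentration or adding formamide lowers 
the Tm, so that at a fixed hybridization temperature the conditions become more 
stringent, in line with the statement above. 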



Among the preferred primers, we can cite: 

primer esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9), 
primer esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10), 
primer RD1mic flanking region F GCAGTGCAAAGGTGCAGATA (SEQ ID No 11), 
primer RD1mic flanking region R GATTGAGACACTTGCCACGA (SEQ ID No 12), 
primer RD5mic flanking region F GAATGCCGACGTCATATCG (SEQ ID No 16), 
primer RD5mic flanking region R CGGCCACTGAGTTCGATTAT (SEQ ID No 17). 
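
To illustrate how a primer pair of this kind delimits an amplified region, the sketch 
below computes the reverse complement of the esat-6 reverse primer and locates both 
primers on a template. The template used here is synthetic and built only for the 
example; it is not a genomic sequence, and the function names are hypothetical. 

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

ESAT6_F = "GTCACGTCCATTCATTCCCT"  # primer esat-6F, from the list above
ESAT6_R = "ATCCCAGTGACGTTGCCTT"   # primer esat-6R, from the list above


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the region delimited by a primer pair, or None.

    The forward primer is searched as written; the reverse primer is
    searched as its reverse complement downstream of the forward primer.
    Only exact matches on the given strand are considered.
    """
    t = template.upper()
    start = t.find(forward.upper())
    if start < 0:
        return None
    rc = reverse_complement(reverse.upper())
    end = t.find(rc, start + len(forward))
    if end < 0:
        return None
    return end + len(rc) - start


# Synthetic template for illustration only (not a real genomic sequence).
TEMPLATE = "TTTT" + ESAT6_F + "ACGT" * 10 + reverse_complement(ESAT6_R) + "AAAA"

if __name__ == "__main__":
    print(reverse_complement(ESAT6_R))
    print(amplicon_length(TEMPLATE, ESAT6_F, ESAT6_R))
```

In practice a primer-design or alignment tool would also handle mismatches and both 
strands; the exact-match search above is kept deliberately minimal. 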


The present invention also covers the complementary nucleotide sequences of the above 
primers as well as the nucleotide sequences hybridizing under stringent conditions with 
them and having at least 20 nucleotides and less than 500 nucleotides. 

Diagnostic kits for the identification at the species level of members of the M. 
tuberculosis complex, comprising antibodies directed to mycobacterial PE, PPE, CFP-10 
and ESAT-6, are also embraced by the invention. As used herein, the term "antibody" 
refers to intact molecules as well as fragments thereof, such as Fab, F(ab')2, and Fv, 
which are capable of binding the epitopic determinant. Probes or antibodies can be 
labeled with isotopes, fluorescent or phosphorescent molecules or by any other means 
known in the art. 



The invention is further detailed below and will be illustrated with the following figures. 
Figure legends 

Figure 1: M. bovis BCG and M. microti have a chromosomal deletion, RD1, 
spanning the cfp10-esat6 locus. 

(A) Map of the cfp10-esat6 region showing the six possible reading frames and the M. 
tuberculosis H37Rv gene predictions. This map is also available at: 
(http://genolist.pasteur.fr/TubercuList/). 

The deleted regions are shown for BCG (red) and M. microti (blue) with their respective 
H37Rv genome coordinates, and the extent of the conserved ESAT-6 locus (F. Tekaia, et 
al., Tubercle Lung Disease 79, 329 (1999)) is indicated by the gray bar. 

(B) Table showing characteristics of deleted regions selected for complementation 
analysis. Potential virulence factors and their putative functions disrupted by each 
deletion are shown. The coordinates are for the M. tuberculosis H37Rv genome. 

(C) Clones used to complement BCG. Individual clones spanning RD1 regions (RD1-I106 
and RD1-2F9) were selected from an ordered M. tuberculosis genomic library (R.B. 
unpublished) in pYUB412 (S. T. Cole, et al., Nature 393, 537 (1998) and W. R. Bange, 
F. M. Collins, W. R. Jacobs, Jr., Tuber. Lung Dis. 79, 171 (1999)) and electroporated 
into M. bovis BCG strains, or M. microti. Hygromycin-resistant transformants were 
verified using PCR specific for the corresponding genes. pAP35 was derived from 
RD1-2F9 by excision of an AflII fragment. pAP34 was constructed by subcloning an 
EcoRI-XbaI fragment into the integrative vector pKINT. The ends of each fragment are 
related to the BCG RD1 deletion (shaded box) with black lines and the H37Rv coordinates 
for the other fragment ends given in kilobases. 

(D) Immunoblot analysis, using an ESAT-6 monoclonal antibody, of whole cell protein 
extracts from log-phase cultures of H37Rv (S. T. Cole, et al., Nature 393, 537 (1998)), 
BCG::pYUB412 (M. A. Behr, et al., Science 284, 1520 (1999)), BCG::RD1-I106 (R. 
Brosch, et al., Infection Immun. 66, 2221 (1998)), BCG::RD1-2F9 (S. V. Gordon, et al., 
Molec Microbiol 32, 643 (1999)), M. bovis (H. Salamon, et al., Genome Res 10, 2044 
(2000)), Mycobacterium smegmatis (G. G. Mahairas, et al., J. Bacteriol. 178, 1274 
(1996)), M. smegmatis::pYUB412, and M. smegmatis::RD1-2F9 (R. Brosch, et al., Proc 
Natl Acad Sci USA 99, 3684 (2002)). 

Figure 2: Complementation of BCG Pasteur with the RD1 region alters the colony 
morphology and leads to accumulation of Rv3873 and ESAT-6 in the cell wall. 

(A) Serial dilutions of 3-week-old cultures of BCG::pYUB412, BCG::I106 or 
BCG::RD1-2F9 growing on Middlebrook 7H10 agar plates. The white square shows the 
area of the plate magnified in the image to the right. 

(B) Light microscope image at fifty-fold magnification of BCG::pYUB412 and 
BCG::RD1-2F9 colonies. 5 µl drops of bacterial suspensions of each strain were spotted 
adjacently onto 7H10 plates and imaged after 10 days of growth, illuminating the 
colonies through the agar. 

(C) Immunoblot analysis of different cell fractions of H37Rv obtained from 
http://www.cvmbs.colostate.edu/microbiology/tb/ResearchMA.html using either an 
anti-ESAT-6 antibody or 

(D) anti-Rv3873 (PPE) rabbit serum. H37Rv and BCG signify whole cell extracts from 
the respective bacteria and Cyt, Mem and CW correspond to the cytosolic, membrane 
and cell wall fractions of M. tuberculosis H37Rv. 

Figure 3: Complementation of BCG Pasteur with the RD1 region increases 
bacterial persistence and pathogenicity in mice. 
(A) Bacteria in the spleen and lungs of BALB/c mice following intravenous (i.v.) 
infection via the lateral tail vein with 10^6 colony forming units (cfu) of M. tuberculosis 
H37Rv (red) or 10^7 cfu of either BCG::pYUB412 (yellow) or BCG::RD1-I106 (green). 

(B) Bacterial persistence in the spleen and lungs of C57BL/6 mice following i.v. 
infection with 10^5 cfu of BCG::pYUB412 (yellow), BCG::RD1-I106 (green) or 
BCG::RD1-2F9 (blue). 

(C) Bacterial multiplication after i.v. infection with 10^6 cfu of BCG::pYUB412 (yellow) 
and BCG::RD1-2F9 (blue) in severe combined immunodeficiency (SCID) mice. For A, 
B, and C each timepoint is the mean of 3 to 4 mice and the error bars represent standard 
deviations. 

(D) Spleens from SCID mice three weeks after i.v. infection with 10^6 cfu of either 
BCG::pYUB412, BCG::RD1-2F9 or BCG::I301 (an RD3 "knock-in", Fig. 1B). The 
scale is in cm. 

Figure 4: Immunisation of mice with BCG::RD1 generates marked ESAT-6 specific 
T-cell responses and enhanced protection to a challenge with M. tuberculosis. 

(A) Proliferative response of splenocytes of C57BL/6 mice immunised subcutaneously 
(s.c.) with 10^6 cfu of BCG::pYUB412 (open squares) or BCG::RD1-2F9 (solid 
squares) to in vitro stimulation with various concentrations of synthetic peptides from 
poliovirus type 1 capsid protein VP1, ESAT-6 or Ag85A (K. Huygen, et al., Infect. 
Immun. 62, 363 (1994), L. Brandt, J. Immunol. 157, 3527 (1996) and C. Leclerc, et al., J. 
Virol. 65, 711 (1991)). 

(B) Proliferation of splenocytes from BCG::RD1-2F9-immunised mice in the absence or 
presence of 10 µg/ml of ESAT-6 1-20 peptide, with or without 1 µg/ml of anti-CD4 
(GK1.5) or anti-CD8 (H35-17-2) monoclonal antibody. Results are expressed as mean 
and standard deviation of [3H]-thymidine incorporation from duplicate wells. 

(C) Concentration of IFN-γ in culture supernatants of splenocytes of C57BL/6 mice 
stimulated for 72 h with peptides or PPD after s.c. or i.v. immunisation with either 
BCG::pYUB412 (red and yellow) or BCG::RD1-2F9 (green and blue). Mice were 
inoculated with either 10^6 (yellow and green) or 10^7 (red and blue) cfu. Levels of IFN-γ 
were quantified using a sandwich ELISA (detection limit of 500 pg/ml) with the mAbs 
R4-6A2 and biotin-conjugated XMG1.2. Results are expressed as the mean and 
standard deviation of duplicate culture wells. 

(D) Bacterial counts in the spleen and lungs of vaccinated and unvaccinated BALB/c
mice 2 months after an i.v. challenge with M. tuberculosis H37Rv. The mice were
challenged 2 months after i.v. inoculation with 10^6 cfu of either BCG::pYUB412 or
BCG::RD1-2F9. Organ homogenates for bacterial enumeration were plated on 7H11
medium, with or without hygromycin, to differentiate M. tuberculosis from residual
BCG colonies. Results are expressed as the mean and standard deviation of 4 to 5 mice
and the levels of significance were derived using the Wilcoxon rank sum test.

Figure 5: Mycobacterium microti strain OV254 BAC map (named MiXXX), overlaid
on the M. tuberculosis H37Rv (named RvXXX) and M. bovis AF2122/97 (named
MbXXX) BAC maps. The scale bars indicate the position on the M. tuberculosis
genome.

Figure 6: Difference in the region 4340-4360 kb between the deletion in BCG RD1bcg
(A) and in M. microti RD1mic (C) relative to M. tuberculosis H37Rv (B).

Figure 7: Difference in the region 3121-3127 kb between M. tuberculosis H37Rv (A)
and M. microti OV254 (B). Gray boxes picture the direct repeats (DR), black ones the
unique numbered spacer sequences. * spacer sequence identical to the one of spacer 58
reported by van Embden et al. (42). Note that spacers 33-36 and 20-22 are not shown
because H37Rv lacks these spacers.

Figure 8: A) AseI PFGE profiles of various M. microti strains; Hybridization with a
radiolabeled B) esat-6 probe; C) probe of the RD1mic flanking region; D) plcA probe. 1.
M. bovis AF2122/97, 2. M. canettii, 3. M. bovis BCG Pasteur, 4. M. tuberculosis H37Rv,
5. M. microti OV254, 6. M. microti Myc 94-2272, 7. M. microti B3 type mouse, 8. M.
microti B4 type mouse, 9. M. microti B2 type llama, 10. M. microti B1 type llama, 11.
M. microti ATCC 35782. M: Low range PFGE marker (NEB).

Figure 9: PCR products obtained from various M. microti strains using primers that
flank the RD1mic region, that amplify the ESAT-6 antigen gene, or that flank the MiD2
region. 1. M. microti B1 type llama, 2. M. microti B4 type mouse, 3. M. microti B3 type
mouse, 4. M. microti B2 type llama, 5. M. microti ATCC 35782, 6. M. microti OV254,
7. M. microti Myc 94-2272, 8. M. tuberculosis H37Rv.

Example 1: Preparation and assessment of M. bovis BCG::RD1 strains as a vaccine
for treating or preventing tuberculosis.

As mentioned above, we have found that complementation with RD1 was accompanied
by a change in colonial appearance as the BCG Pasteur "knock-in" strains developed a
strikingly different morphotype (Fig. 2A). The RD1 complemented strains adopted a
spreading, less-rugose morphology, that is characteristic of M. bovis, and this was more
apparent when the colonies were inspected by light microscopy (Fig. 2B). Maps of the
clones used are shown (Fig. 1C). These changes were seen following complementation
with all of the RD1 constructs (Fig. 1C) and on complementing M. microti (data not
shown). Pertinently, Calmette and Guerin (A. Calmette, La vaccination preventive
contre la tuberculose. (Masson et cie., Paris, 1921)) observed a change in colony
morphology during their initial passaging of M. bovis, and our experiments now
demonstrate that this change, corresponding to loss of RD1, directly contributed to
attenuating this virulent strain. The integrity of the cell wall is known to be a key
virulence determinant,
and changes in both cell wall lipids (M. S. Glickman, J. S. Cox, W. R. Jacobs, Jr., Mol
Cell 5, 717 (2000)) and protein (F. X. Berthet, et al., Science 282, 759 (1998)) have been
shown to alter colony morphology and diminish persistence in animal models.

To determine which genes were implicated in these morphological changes, antibodies
recognising three RD1 proteins (Rv3873, CFP-10 and ESAT-6) were used in
immunocytological and subcellular fractionation analysis. When the different cell
fractions from M. tuberculosis were immunoblotted all three proteins were localized in
the cell wall fraction (Fig. 2C) though significant quantities of Rv3873, a PPE protein,
were also detected in the membrane and cytosolic fractions (Fig. 2D). Using
immunogold staining and electron microscopy the presence of ESAT-6 in the envelope
of M. tuberculosis was confirmed but no alteration in capsular ultrastructure could be
detected (data not shown). Previously, CFP-10 and ESAT-6 have been considered as
secreted proteins (F. X. Berthet et al., Microbiology 144, 3195 (1998)) but our results
suggest that their biological functions are linked directly with the cell wall.

Changes in colonial morphology are often accompanied by altered bacterial virulence.
Initial assessment of the growth of different BCG::RD1 "knock-ins" in C57BL/6 or
BALB/c mice following intravenous infection revealed that complementation did not
restore levels of virulence to those of the reference strain M. tuberculosis H37Rv (Fig.
3A). In longer-term experiments, modest yet significant differences were detected in the
persistence of the BCG::RD1 "knock-ins" in comparison to BCG controls. Following
intravenous infection of C57BL/6 mice, only the RD1 "knock-ins" were still detectable
in the lungs after 106 days (Fig. 3B). This difference in virulence between the RD1
recombinants and the BCG vector control was more pronounced in severe combined
immunodeficiency (SCID) mice (Fig. 3C). The BCG::RD1 "knock-in" was markedly
more virulent, as evidenced by the growth rate in lungs and spleen and also by an
increased degree of splenomegaly (Fig. 3D). Cytological examination revealed
numerous bacilli, extensive cellular infiltration and granuloma formation. These
increases in virulence following complementation with the RD1 region demonstrate that
the loss of this genomic locus contributed to the attenuation of BCG.

The inability to restore full virulence to BCG Pasteur was not due to instability of our
constructs nor to the strain used (data not shown). Essentially identical results were
obtained on complementing BCG Russia, a strain less passaged than BCG Pasteur and
presumed, therefore, to be closer to the original ancestor (M. A. Behr, et al., Science 284,
1520 (1999)). This indicates that the attenuation of BCG was a polymutational process
and loss of residual virulence for animals was documented in the late 1920s (T.
Oettinger, et al., Tuber Lung Dis 79, 243 (1999)). Using the same experimental strategy,
we also tested the effects of complementing with RD3-5, RD7 and RD9 (S. T. Cole, et
al., Nature 393, 537 (1998); M. A. Behr, et al., Science 284, 1520 (1999); R. Brosch, et
al., Infect. Immun. 66, 2221 (1998) and S. V. Gordon et al., Molec Microbiol 32, 643
(1999)) encoding putative virulence factors (Fig. 1B). Reintroduction of these regions,
which are not restricted to avirulent strains, did not affect virulence in
immunocompetent mice. Although it is possible that deletion effects act synergistically it
seems more plausible that other attenuating mechanisms are at play.

Since RD1 encodes at least two potent T-cell antigens (R. Colangelli, et al., Infect
Immun. 68, 990 (2000), M. Harboe, et al., Infect Immun. 66, 717 (1998) and R. L. V.
Skjøt, et al., Infect. Immun. 68, 214 (2000)), we investigated whether its restoration
induced immune responses to these antigens or even improved the protective capacity of
BCG. Three weeks following either intravenous or subcutaneous inoculation with
BCG::RD1 or BCG controls, we observed similar proliferation of splenocytes to an
Ag85A (an antigenic BCG protein) peptide (K. Huygen, et al., Infect Immun. 62, 363
(1994)), but not against a control viral peptide (Fig. 4A). Moreover, BCG::RD1
generated powerful CD4+ T-cell responses against the ESAT-6 peptide as shown by
splenocyte proliferation, whereas
the BCG::pYUB412 control did not stimulate ESAT-6 specific T-cell responses thus
indicating that these were mediated by the RD1 locus. ESAT-6 is, therefore, highly
immunogenic in mice in the context of recombinant BCG.

When used as a subunit vaccine, ESAT-6 elicits T-cell responses and induces levels of
protection weaker than but akin to those of BCG (L. Brandt et al., Infect Immun. 68, 791
(2000)). Challenge experiments were conducted to determine if induction of immune
responses to BCG::RD1-encoded antigens, such as ESAT-6, could improve protection
against infection with M. tuberculosis. Groups of mice inoculated with either
BCG::pYUB412 or BCG::RD1 were subsequently infected intravenously with M.
tuberculosis H37Rv. These experiments showed that immunisation with the BCG::RD1
"knock-in" inhibited the growth of M. tuberculosis within both BALB/c (Fig. 4D) and
C57BL/6 mice when compared to inoculation with BCG alone.

Although the increases in protection induced by BCG::RD1 relative to the BCG control
are modest they demonstrate convincingly that genetic differences have developed between
the live vaccine and the pathogen which have weakened the protective capacity of BCG.
This study therefore defines the genetic basis of a compromise that has occurred, during
the attenuation process, between loss of virulence and reduced protection (M. A. Behr, P.
M. Small, Nature 389, 133 (1997)). The recombinant BCGs presented here may not be
appropriate in their current form as vaccine candidates because of uncertainty about their
safety. However, the strategy of reintroducing, or even overproducing (M. A. Horwitz et
al., Proc Natl Acad Sci USA 97, 13853 (2000)), the missing immunodominant antigens
of M. tuberculosis in BCG, could be combined with an immuno-neutral attenuating
mutation to create a more efficacious tuberculosis vaccine.

Example 2: BAC-based comparative genomics identifies Mycobacterium microti as a
natural ESAT-6 deletion mutant.

We searched for any genetic differences between human and vole isolates that might
explain their different degree of virulence and host preference and what makes the vole
isolates harmless for humans. In this regard, comparative genomics methods were
employed in connection with the present invention to identify major differences that may
exist between the M. microti reference strain OV254 and the entirely sequenced strains
of M. tuberculosis H37Rv (10) or M. bovis AF2122/97 (14). An ordered Bacterial
Artificial Chromosome (BAC) library of M. microti OV254 was constructed and
individual BAC to BAC comparison of a minimal set of these clones with BAC clones
from previously constructed libraries of M. tuberculosis H37Rv and M. bovis AF2122/97
was undertaken.

Ten regions were detected in M. microti that were different to the corresponding
genomic regions in M. tuberculosis and M. bovis. To investigate if these regions were
associated with the ability of M. microti strains to infect humans, their genetic
organization was studied in 8 additional M. microti strains, including those isolated
recently from patients with pulmonary tuberculosis. This analysis identified some
regions that were specifically absent from all tested M. microti strains, but present in all
other members of the M. tuberculosis complex and other regions that were only absent
from vole isolates of M. microti.



2.1 MATERIALS AND METHODS

Bacterial strains and plasmids. M. microti OV254 which was originally isolated from
voles in the UK in the 1930's was kindly supplied by MJ Colston (45). DNA from M.
microti OV216 and OV183 was included in a set of strains used during a multicenter
study (26). M. microti Myc 94-2272 was isolated in 1988 from the perfusion fluid of a
41-year-old dialysis patient (43) and was kindly provided by L. M. Parsons. M. microti
35782 was purchased from American Type Culture Collection (designation TMC 1608
(M.P. Prague)). M. microti B1 type llama, B2 type llama, B3 type mouse and B4 type
mouse were obtained from the collection of the National Reference Center for
Mycobacteria, Forschungszentrum Borstel, Germany. M. bovis strain AF2122/97,
spoligotype 9 was responsible for a herd outbreak in Devon in the UK and has been
isolated from lesions in both cattle and badgers. Typically, mycobacteria were grown on
7H9 Middlebrook liquid medium (Difco) containing 10% oleic-acid-dextrose-catalase
(Difco), 0.2% pyruvic acid and 0.05% Tween 80.

Library construction, preparation of BAC DNA and sequencing reactions.
Preparation of agarose-embedded genomic DNA from M. microti strain OV254, M.
tuberculosis H37Rv and M. bovis BCG was performed as described by Brosch et al. (5). The
M. microti library was constructed by ligation of partially digested HindIII fragments
(50-125 kb) into pBeloBAC11. From the 10,000 clones that were obtained, 2,000 were
picked into 96 well plates and stored at -80°C. Plasmid preparations of recombinant
clones for sequencing reactions were obtained by pooling eight copies of 96 well plates,
with each well containing an overnight culture in 250 µl 2YT medium with 12.5 µg.ml^-1
chloramphenicol. After 5 min centrifugation at 3000 rpm, the bacterial pellets were
resuspended in 25 µl of solution A (25 mM Tris, pH 8.0, 50 mM glucose and 10 mM
EDTA), cells were lysed by adding 25 µl of buffer B (NaOH 0.2 M, SDS 0.2%). Then
20 µl of cold 3 M sodium acetate pH 4.8 were added and kept on ice for 30 min. After
centrifugation at 3000 rpm for 30 min, the pooled supernatants (140 µl) were transferred
to new plates. 130 µl of isopropanol were added, and after 30 min on ice, DNA was
pelleted by centrifugation at 3500 rpm for 15 min. The supernatant was discarded and
the pellet resuspended in 50 µl of a 10 µg/ml RNAse A solution (in Tris 10 mM pH 7.5/
EDTA 10 mM) and incubated at 64°C for 15 min. After precipitation (2.5 µl of sodium
acetate 3 M pH 7 and 200 µl of absolute ethanol) pellets were rinsed with 200 µl of 70%
ethanol, air dried and finally suspended in 20 µl of TE buffer.

End-sequencing reactions were performed with a Taq DyeDeoxy Terminator cycle
sequencing kit (Applied Biosystems) using a mixture of 13 µl of DNA solution, 2 µl of
primer (2 µM) (SP6-BAC1, AGTTAGCTCACTCATTAGGCA (SEQ ID No 7), or
T7-BAC1, GGATGTGCTGCAAGGCGATTA (SEQ ID No 8)), 2.5 µl of Big Dye and 2.5
µl of a 5X buffer (50 mM MgCl2, 50 mM Tris). Thermal cycling was performed on a
PTC-100 amplifier (MJ Inc.) with an initial denaturation step of 60 s at 95°C, followed
by 90 cycles of 15 s at 95°C, 15 s at 56°C, 4 min at 60°C. DNA was then precipitated
with 80 µl of 76% ethanol and centrifuged at 3000 rpm for 30 min. After discarding the
supernatant, DNA was finally rinsed with 80 µl of 70% ethanol and resuspended in
appropriate buffers depending on the type of automated sequencer used (ABI 377 or ABI
3700). Sequence data were transferred to Digital workstations and edited using the TED
software from the Staden package (37). Edited sequences were compared against the M.
tuberculosis H37Rv database (http://genolist.pasteur.fr/TubercuList/), the M. bovis
BLAST server (http://www.sanger.ac.uk/Projects/M_bovis/blast_server.shtml), and
in-house databases to determine the relative positions of the M. microti OV254 BAC
end-sequences.

Preparation of BAC DNA from recombinants and BAC digestion profile
comparison. DNA for digestion was prepared as previously described (4). DNA (1 µg)
was digested with HindIII (Boehringer) and restriction products separated by pulsed-field
gel electrophoresis (PFGE) on a Biorad CHEF-DR III system using a 1% (w/v) agarose
gel and a pulse of 3.5 s for 17 h at 6 V.cm^-1. Low-range PFGE markers (NEB) were used
as size standards. Insert sizes were estimated after ethidium bromide staining and
visualization with UV light. Different comparisons were made with overlapping clones
from the M. microti OV254, M. bovis AF2122/97, and M. tuberculosis H37Rv
pBeloBAC11 libraries.



PCR analysis to determine presence of genes in different M. microti strains.
Reactions contained 5 µl of 10x PCR buffer (100 mM β-mercaptoethanol, 600 mM
Tris-HCl, pH 8.8, 20 mM MgCl2, 170 mM (NH4)2SO4, 20 mM nucleotide mix dNTP), 2.5 µl
of each primer at 2 µM, 10 ng of template DNA, 10% DMSO and 0.5 unit of Taq
polymerase in a final volume of 12.5 µl. Thermal cycling was performed on a PTC-100
amplifier (MJ Inc.) with an initial denaturation step of 90 s at 95°C, followed by 35
cycles of 45 s at 95°C, 1 min at 60°C and 2 min at 72°C.
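
As a rough planning aid, the hold times of the PCR program described above can be added up. The sketch below is only an illustration: the function name is hypothetical, and ramp times between temperatures and any final extension or cooling step are not stated in the text and are therefore ignored.

```python
# Hold-time estimate for the PCR program described above.
# Assumption: ramping between temperatures is ignored, so the real
# run on a thermal cycler will take somewhat longer.

def program_seconds(initial_s, cycles, step_seconds):
    """Return total hold time in seconds for one cycling program."""
    return initial_s + cycles * sum(step_seconds)

# 90 s at 95 C, then 35 cycles of 45 s / 60 s / 120 s
pcr_total_s = program_seconds(90, 35, [45, 60, 120])
pcr_total_min = pcr_total_s / 60

print(f"PCR hold time: {pcr_total_s} s ({pcr_total_min:.1f} min)")
```

The same helper can be applied to any other program by passing its own initial step, cycle count and per-cycle step durations.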

RFLP analysis. In brief, agarose plugs of genomic DNA prepared as previously
described (5) were digested with either AseI, DraI or XbaI (NEB), then electrophoresed
on a 1% agarose gel, and finally transferred to Hybond-C extra nitrocellulose membranes
(Amersham). Different probes were amplified by PCR from the M. microti strain OV254
or M. tuberculosis H37Rv using primers for:

esat-6 (esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9);
esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10)),

the RD1mic flanking region (4340,209F GCAGTGCAAAGGTGCAGATA (SEQ ID No
11); 4354,701R GATTGAGACACTTGCCACGA (SEQ ID No 12)), or

plcA (plcA.intF CAAGTTGGGTCTGGTCGAAT (SEQ ID No 13); plcA.intR
GCTACCCAAGGTCTCCTGGT (SEQ ID No 14)).

Amplification products were radio-labeled by using the Stratagene Prime-It II kit
(Stratagene). Hybridizations were
performed at 65°C in a solution containing NaCl 0.8 M, EDTA pH 8, 5 mM, sodium
phosphate 50 mM pH 8, 2% SDS, 1X Denhardt's reagent and 100 µg/ml salmon sperm
DNA (Genaxis). Membranes were exposed to phosphorimager screens and images were
digitized by using a STORM phospho-imager.
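
For readers who want to sanity-check the probe primers listed above, the following sketch computes length, GC fraction and a rule-of-thumb melting temperature for each oligonucleotide. It is an illustration only: the Wallace rule (2 °C per A/T, 4 °C per G/C) is a generic approximation chosen here, not a method named in the text, and the function names are hypothetical.

```python
# Basic properties of the probe primers (sequences copied from the text).
# Assumption: Tm is estimated with the simple Wallace rule, which is only
# a rough guide for oligonucleotides of this length.

PRIMERS = {
    "esat-6F": "GTCACGTCCATTCATTCCCT",
    "esat-6R": "ATCCCAGTGACGTTGCCTT",
    "4340,209F": "GCAGTGCAAAGGTGCAGATA",
    "4354,701R": "GATTGAGACACTTGCCACGA",
    "plcA.intF": "CAAGTTGGGTCTGGTCGAAT",
    "plcA.intR": "GCTACCCAAGGTCTCCTGGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Wallace rule: 2 C per A/T plus 4 C per G/C."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

for name, seq in PRIMERS.items():
    print(f"{name:10s} len={len(seq):2d} GC={gc_fraction(seq):.2f} "
          f"Tm~{wallace_tm(seq)} C  revcomp={reverse_complement(seq)}")
```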

DNA sequence accession numbers. The nucleotide sequences that flank MiD1, MiD2,
MiD3 as well as the junction sequence of RD1mic have been deposited in the EMBL
database. Accession numbers are AJ345005, AJ345006, AJ315556 and AJ315557,
respectively.



2.2 RESULTS 

Establishment of a complete ordered BAC library of M. microti OV254.
Electroporation of pBeloBAC11 containing partial HindIII digests of M. microti OV254
DNA into Escherichia coli DH10B yielded about 10,000 recombinant clones, from
which 2,000 were isolated and stored in 96-well plates. Using the complete sequence of
the M. tuberculosis H37Rv genome as a scaffold, end-sequencing of 384 randomly
chosen M. microti BAC clones allowed us to select enough clones to cover almost all of
the 4.4 Mb chromosome. A few rare clones that spanned regions that were not covered
by this approach were identified by PCR screening of pools as previously described (4).
This resulted in a minimal set of 50 BACs, covering over 99.9% of the M. microti
OV254 genome, whose positions relative to M. tuberculosis H37Rv are shown in Figure
5. The insert size ranged between 50 and 125 kb, and the recombinant clones were
stable. Compared with other BAC libraries from tubercle bacilli (4, 13) the M. microti
OV254 BAC library contained clones that were generally larger than those obtained
previously, which facilitated the comparative genomics approach described below.
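
The statement that 384 end-sequenced clones were enough to cover almost all of the 4.4 Mb chromosome can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only and makes assumptions not stated in the text: inserts are treated as randomly and uniformly distributed, the Clarke-Carbon formula P = 1 - (1 - i/G)^N is used for the probability that a given locus is represented, and the lower bound of the reported insert range (50 kb) is taken as a conservative insert size. The function names are hypothetical.

```python
# Rough coverage estimate for the M. microti OV254 BAC library.
# Assumptions: random clone distribution, and a conservative insert
# size of 50 kb (the lower bound of the 50-125 kb range in the text).

GENOME_KB = 4400      # 4.4 Mb chromosome
MIN_INSERT_KB = 50    # lower bound of reported insert sizes

def fold_coverage(n_clones, insert_kb, genome_kb):
    """Total cloned DNA expressed in genome equivalents."""
    return n_clones * insert_kb / genome_kb

def prob_locus_covered(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon probability that a given locus is in the library."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones

for n in (384, 2000):
    cov = fold_coverage(n, MIN_INSERT_KB, GENOME_KB)
    p = prob_locus_covered(n, MIN_INSERT_KB, GENOME_KB)
    print(f"{n:5d} clones: {cov:5.1f}x coverage, P(locus present) = {p:.4f}")
```

Under these assumptions a small fraction of loci is still expected to be missed by 384 random clones, which is consistent with the need to recover a few rare clones by PCR screening of pools.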

Identification of DNA deletions in M. microti OV254 relative to M. tuberculosis
H37Rv by comparative genomics. The minimal overlapping set of 50 BAC clones,
together with the availability of three other ordered BAC libraries from M. tuberculosis
H37Rv, M. bovis BCG Pasteur 1173P2 (5, 13) and M. bovis AF2122/97 (14) allowed us
to carry out direct BAC to BAC comparison of clones spanning the same genomic
regions. Size differences of PFGE-separated HindIII restriction fragments from M.
microti OV254 BACs, relative to restriction fragments from M. bovis and/or M.
tuberculosis BAC clones, identified loci that differed among the tested strains. Size
variations of at least 2 kb were easily detectable and 10 deleted regions, evenly
distributed around the genome, and containing more than 60 open reading frames
(ORFs), were identified. These regions represent over 60 kb that are missing from M.
microti OV254 strain compared to M. tuberculosis H37Rv. First, it was found that
phiRv2 (RD11), one of the two M. tuberculosis H37Rv prophages, was present in M.
microti OV254, whereas phiRv1, also referred to as RD3 (29), was absent. Second, it was
found that M. microti lacks four of the genomic regions that were also absent from M.
bovis BCG. In fact, these four regions of difference named RD7, RD8, RD9 and RD10
are absent from all members of the M. tuberculosis complex with the exception of M.
tuberculosis and M. canettii, and seem to have been lost from a common progenitor
strain of M. africanum, M. microti and M. bovis (3). As such, our results obtained with
individual BAC to BAC comparisons show that M. microti is part of this non-M.
tuberculosis lineage of the tubercle bacilli, and this assumption was further confirmed by
sequencing the junction regions of RD7 - RD10 in M. microti OV254. The sequences
obtained were identical to those from M. africanum, M. bovis and M. bovis BCG strains.
Apart from these four conserved regions of difference, and phiRv1 (RD3), M. microti
OV254 did not show any other RDs with identical junction regions to M. bovis BCG
Pasteur, which misses at least 17 RDs relative to M. tuberculosis H37Rv (1, 13, 35).
However, five other regions missing from the genome of M. microti OV254 relative to
M. tuberculosis H37Rv were identified (RD1mic, RD5mic, MiD1, MiD2, MiD3). Such
regions are specific either for strain OV254 or for M. microti strains in general.
Interestingly, two of these regions, RD1mic and RD5mic, partially overlap RDs from M.
bovis BCG.

Antigens ESAT-6 and CFP-10 are absent from M. microti. One of the most
interesting findings of the BAC to BAC comparison was a novel deletion in a genomic
region close to the origin of replication (figure 5). Detailed PCR and sequence analysis
of this region in M. microti OV254 showed a segment of 14 kb to be missing (equivalent
to M. tuberculosis H37Rv from 4340,4 to 4354,5 kb) that partly overlapped RD1bcg
absent from M. bovis BCG. More precisely, ORFs Rv3864 and Rv3876 are truncated in
M. microti OV254 and ORFs Rv3865 to Rv3875 are absent (figure 6). This observation
is particularly interesting as previous comparative genomic analysis identified RD1bcg as
the only RD region that is specifically absent from all BCG sub-strains but present in all
other members of the M. tuberculosis complex (1, 4, 13, 29, 35). As shown in Figure 6,
in M. microti OV254 the RD1mic deletion is responsible for the loss of a large portion of
the conserved ESAT-6 family core region (40) including the genes coding for the major
T-cell antigens ESAT-6 and CFP-10 (2, 15). The fact that previous deletion screening
protocols employed primer sequences that were designed for the right hand portion of
the RD1bcg region (i.e. gene Rv3878) (6, 39) explains why the RD1mic deletion was not
detected earlier by these investigations. Figure 6 shows that RD1mic does not affect genes
Rv3877, Rv3878 and Rv3879 which are part of the RD1bcg deletion.

Deletion of phospholipase-C genes in M. microti OV254. RD5mic, the other region
absent from M. microti OV254 that partially overlapped an RD region from BCG, was
revealed by comparison of BAC clone Mi18A5 with BAC Rv143 (figure 5). PCR
analysis and sequencing of the junction region revealed that RD5mic was smaller than the
RD5 deletion in BCG (Table 1 and 2 below).

TABLE 1.

Description of the putative function of the deleted and truncated ORFs in M. microti OV254

Region   Start - End (kb)   Overlapping ORF   Putative function or family

RD10     264,5-266,5        Rv0221-Rv0223     echA1
RD3      1779,5-1788,5      Rv1573-Rv1586     bacteriophage proteins
RD7      2207,5-2220,5      Rv1964-Rv1977     yrbE3A-3B; mce3A-F; unknown
RD9      2330-2332          Rv2072-Rv2075     cobL; probable oxidoreductase; unknown
RD5mic   2627,6-2633,4      Rv2348-Rv2352     plcA-C; member of PPE family
MiD1     3121,8-3126,6      Rv2816-Rv2819     IS6110 transposase; unknown
MiD2     3554,0-3555,2      Rv3187-Rv3190     IS6110 transposase; unknown
MiD3     3741,1-3755,7      Rv3345-Rv3349     members of the PE-PGRS and PPE families; insertion elements
RD8      4056,8-4062,7      Rv3617-Rv3618     ephA; lpqG; member of the PE-PGRS family
RD1mic   4340,4-4354,5      Rv3864-Rv3876     member of the CBXX/CFQX family; member of the PE and PPE families; ESAT-6; CFP10; unknown

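
The approximate size of each deleted region follows directly from the start and end coordinates in Table 1. The short sketch below does this arithmetic; it is an illustration rather than part of the described method. The coordinates are transcribed from Table 1 with decimal commas converted to points, and the MiD2 end coordinate is taken as 3555.2 kb, in line with the junction positions given in Table 2 (an assumption made here).

```python
# Approximate sizes (kb) of the regions listed in Table 1, computed
# as end minus start in M. tuberculosis H37Rv coordinates.
# Assumption: MiD2 ends at 3555.2 kb, as implied by Table 2.

REGIONS_KB = {
    "RD10":   (264.5, 266.5),
    "RD3":    (1779.5, 1788.5),
    "RD7":    (2207.5, 2220.5),
    "RD9":    (2330.0, 2332.0),
    "RD5mic": (2627.6, 2633.4),
    "MiD1":   (3121.8, 3126.6),
    "MiD2":   (3554.0, 3555.2),
    "MiD3":   (3741.1, 3755.7),
    "RD8":    (4056.8, 4062.7),
    "RD1mic": (4340.4, 4354.5),
}

sizes_kb = {name: round(end - start, 1) for name, (start, end) in REGIONS_KB.items()}
total_kb = round(sum(sizes_kb.values()), 1)

for name, size in sizes_kb.items():
    print(f"{name:7s} {size:5.1f} kb")
print(f"total   {total_kb:5.1f} kb")
```

With these coordinates the total comes out above 60 kb, which agrees with the figure quoted in the text, and RD1mic comes out at about 14 kb.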

TABLE 2. Sequence at the junction of the deleted regions in M. microti OV254

Junction: RD1mic (SEQ ID No 15)
  Position:          4340,421-4354,533
  ORFs:              Rv3864-Rv3876
  Flanking primers:  4340,209F (SEQ ID No 11) GCAGTGCAAAGGTGCAGATA
                     4354,701R (SEQ ID No 12) GATTGAGACACTTGCCACGA
  Sequence at the junction:
    CAAGACGAGGTTGTAAAACCTCGACG
    CAGGATCGGCGATGAAATGCCAGTCG
    GCGTCGCTGAGCGCGCGCTGCGCC04
    TCCCA TTTTGTCGCTGA TTTGTTTGAA CA
    GCGA CGAA CCGGTGTTGAAAA TGTCGCCT
    GGGTCGGGGA TTCCCT

Junction: RD5mic (SEQ ID No 18)
  Position:          2627,831-2635,581
  ORFs:              Rv2349-Rv2355
  Flanking primers:  2627.370F (SEQ ID No 16) GAATGCCGACGTCATATCG
                     2633.692R (SEQ ID No 17) CGGCCACTGAGTTCGATTAT
  Sequence at the junction:
    CCTCGATGAACCACCTGACATGACCC
    CATCCTTTCCAAGAACTGGAGTCTCC
    GGACATCCCGGGGCGGTTCACTGCCC
    CAGGTGTCCTGGGTCGTTCCGTTGACCGT
    CGA GTCCGAACA TCCGTCA TTCCCGGTGG
    CA GTCGGTGCGGTGA C

Junction: MiD1 (SEQ ID No 21)
  Position:          3121,880-3126,684
  ORFs:              Rv2815c-Rv2818c
  Flanking primers:  3121.690F (SEQ ID No 19) CAGCCAACACCAAGTAGACG
                     3126,924R (SEQ ID No 20) TCTACCTGCAGTCGCTTGTG
  Sequence at the junction:
    CACCTGACATGACCCCATCCTTTCCA
    AGAACTGGAGTCTCCGGACATGCCGG
    GGCGGTTC AG GGi4 CA TTCA TGTCCA TCTT
    CTGGCA GA TCA GCA GA TCGCTTGTTCTCA G
    TGCA GGTGA GTC

Junction: MiD2 (SEQ ID No 24)
  Position:          3554,066-3555,259
  ORFs:              Rv3188-Rv3189
  Flanking primers:  3553,880F (SEQ ID No 22) GTCCATCGAGGATGTCGAGT
                     3555,385R (SEQ ID No 23) CTAGGCCATTCCGTTGTCTG
  Sequence at the junction:
    GCTGCCTACTACGCTCAACGCCAGAG
    ACCAGCCGCCGGCTG AGGTCTCAGAT
    CAGAGAGTCTCCGGACTCACCGGGGC
    GGTTCATAAA GGCTTCGA GA CCGGA CGG
    GCTGTA GGTTCCTCAA CTGTGTGGCGGA T
    GGTCTGA GCA CTTAA C

Junction: MiD3 (SEQ ID No 27)
  Position:          3741,139-3755,777
  ORFs:              Rv3345c-Rv3349c
  Flanking primers:  3740,950F (SEQ ID No 25) GGCGACGCCATTTCC
                     3755,988R (SEQ ID No 26) AACTGTCGGGCTTGCTCTT
  Sequence at the junction:
    TGGCGCCGGCACCTCCGTTGCCACCG
    TTGCCGCCGCTGGTGGGCGCGGTGCC
    GTTCGCCCCGGCCGAACCGTTCAGGG
    CCGGGTTC GCCCTCA GCCG CTAAA CA CG
    CCGA CCAA GA TCAA CGA GCTA CCTGCC CG
    GTCAA GGTTGAA GA GCCCCCA TA TCA G CA
    AGGGCCCGGTGTCGGCG
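
The flanking primers in Table 2 are named after their positions on the M. tuberculosis H37Rv chromosome, so the expected size of a junction PCR product can be estimated by subtracting the deleted segment from the distance between the primers. The sketch below works this through for RD1mic as an illustration. Several points are assumptions: primer names are read as base-pair coordinates (4340,209 as 4,340,209), the exact end of each primer that the coordinate refers to is not stated, and any insertion element remaining at a junction (as described for the IS6110-associated deletions) would add to the product size. The function name is hypothetical.

```python
# Estimated junction PCR product size for a strain carrying a deletion.
# Assumption: primer names encode H37Rv base-pair coordinates, so the
# results are approximate (off by a few bp at most).

def product_after_deletion(fwd_pos, rev_pos, del_start, del_end):
    """Distance between the flanking primers minus the deleted segment."""
    if not (fwd_pos < del_start < del_end < rev_pos):
        raise ValueError("primers must flank the deleted segment")
    return (rev_pos - fwd_pos) - (del_end - del_start)

# RD1mic: primers 4340,209F / 4354,701R, deletion 4340,421-4354,533
h37rv_span = 4354701 - 4340209
rd1mic_product = product_after_deletion(4340209, 4354701, 4340421, 4354533)

print(f"H37Rv span between primers: ~{h37rv_span} bp")
print(f"Expected product in RD1mic-deleted strains: ~{rd1mic_product} bp")
```

A span of roughly 14.5 kb in the undeleted genome is beyond the 2 min extension used in the PCR protocol above, so under these assumptions a short product would be expected only from strains that carry the deletion.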



In fact, M. microti OV254 lacks the genes plcA, plcB, plcC and two specific PPE-protein
encoding genes (Rv2352, Rv2353). This was confirmed by the absence of a clear band
on a Southern blot of AseI digested genomic DNA from M. microti OV254 hybridized
with a plcA probe. However, the genes Rv2346c and Rv2347c, members of the esat-6
family, and Rv2348c, that are missing from M. bovis and BCG strains (3) are still
present in M. microti OV254. The presence of an IS6110 element in this segment
suggests that recombination between two IS6110 elements could have been involved in
the loss of RD5mic, and this is supported by the finding that the remaining copy of
IS6110 does not show a 3 base-pair direct repeat in strain OV254 (Table 2).



Lack of MiD1 provides genomic clue for M. microti OV254 characteristic
spoligotype. MiD1 encompasses the three ORFs Rv2816, Rv2817 and Rv2818 that
encode putative proteins whose functions are yet unknown, and has occurred in the
direct repeat region (DR), a polymorphic locus in the genomes of the tubercle bacilli that
contains a cluster of direct repeats of 36 bp, separated by unique spacer sequences of 36
to 41 bp (17) (figure 7). The presence or absence of 43 unique spacer sequences that
intercalate the DR sequences is the basis of spacer-oligo typing, a powerful typing
method for strains from the M. tuberculosis complex (23). M. microti isolates exhibit a
characteristic spoligotype with an unusually small DR cluster, due to the presence of
only spacers 37 and 38 (43). In M. microti OV254, the absence of spacers 1 to 36, which
are present in many other M. tuberculosis complex strains, appears to result from an
IS6110 mediated deletion of 636 bp of the DR region. Amplification and PvuII
restriction analysis of a 2.8 kb fragment obtained with primers located in the genes that
flank the DR region (Rv2813c and Rv2819) showed that there is only one copy of
IS6110 remaining in this region (figure 7). This IS6110 element is inserted into ORF
Rv2819 at position 3,119,932 relative to the M. tuberculosis H37Rv genome. As for
other IS6110 elements that result from homologous recombination between two copies
(7), no 3 base-pair direct repeat was found for this copy of IS6110 in the DR region.
Concerning the absence of spacers 39-43 (figure 7), it was found that M. microti showed
a slightly different organization of this locus than M. bovis strains, which also
characteristically lack spacers 39-43. In M. microti OV254 an extra spacer of 36 bp was
found that was not present in M. bovis nor in M. tuberculosis H37Rv. The sequence of
this specific spacer was identical to that of spacer 58 reported by van Embden and
colleagues (42). In their study of the DR region in many strains from the M. tuberculosis
complex this spacer was only found in M. microti strain NLA000016240 (AF189828)
and in some ancestral M. tuberculosis strains (3, 42). Like MiD1, MiD2 most probably
results from an IS6110-mediated deletion of two genes (Rv3188, Rv3189) that encode
putative proteins whose function is unknown (Table 2 above and Table 3 below).
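
The spoligotype described above can be written down as a 43-position presence/absence pattern. The sketch below encodes the M. microti pattern (only spacers 37 and 38 present) as a binary string and then as an octal code. It is an illustration only: the function names are hypothetical, and the octal convention used here (the first 42 positions read as 14 triplets, the 43rd position appended as a single 0 or 1) is a commonly used shorthand that is assumed rather than taken from the text.

```python
# Spoligotype pattern for 43 spacers as binary and octal strings.
# Assumption: octal code = 14 triplets of positions 1-42, each read as
# one octal digit, followed by position 43 as a single digit.

N_SPACERS = 43

def spoligo_binary(present_spacers):
    """'1' where a spacer (numbered from 1) is present, else '0'."""
    return "".join(
        "1" if i in present_spacers else "0" for i in range(1, N_SPACERS + 1)
    )

def spoligo_octal(binary):
    """Convert a 43-character binary pattern to a 15-digit octal code."""
    if len(binary) != N_SPACERS:
        raise ValueError("pattern must have 43 positions")
    digits = [str(int(binary[i:i + 3], 2)) for i in range(0, 42, 3)]
    digits.append(binary[42])
    return "".join(digits)

microti_binary = spoligo_binary({37, 38})
microti_octal = spoligo_octal(microti_binary)

print(microti_binary)
print(microti_octal)
```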

TABLE 3. Presence of the RD and MiD regions in different M. microti strains

HOST     VOLES                                  HUMAN
Strain   OV254    OV183    OV216    ATCC 35782  Myc 94-2272  B3 type mouse  B4 type mouse  B1 type llama  B2 type llama

RD1mic   absent   absent   absent   absent      absent       absent         absent         absent         absent
RD3      absent   absent   absent   absent      absent       absent         absent         absent         absent
RD7      absent   absent   absent   absent      absent       absent         absent         absent         absent
RD8      absent   absent   absent   absent      absent       absent         absent         absent         absent
RD9      absent   absent   absent   absent      absent       absent         absent         absent         absent
RD10     absent   absent   absent   absent      absent       absent         absent         absent         absent
MiD3     absent   ND       ND       absent      absent       absent         absent         absent         absent
MiD1     absent   ND       ND       present     partial      partial        partial        present        present
RD5mic   absent   absent   absent   present     present      present        present        present        present
MiD2     absent   ND       ND       present     present      present        present        present        present

ND, not determined



Absence of some members of the PPE family in M. microti. MiD3 was identified by
the absence of two HindIII sites in BAC Mi4B9 that exist at positions 3749 kb and 3754
kb in the M. tuberculosis H37Rv chromosome. By PCR and sequence analysis, it was
determined that MiD3 corresponds to a 12 kb deletion that has truncated or removed five
genes orthologous to Rv3345c-Rv3349c. Rv3347c encodes a protein of 3157 amino
acids that belongs to the PPE family and Rv3346c a conserved protein that is also
present in M. leprae. The function of both these putative proteins is unknown while
Rv3348 and Rv3349 are part of an insertion element (Table 1). At present, the
consequences of the MiD3 deletions for the biology of M. microti remain entirely
unknown.

Extra-DNA in M. microti OV254 relative to M. tuberculosis H37Rv. M. microti
OV254 possesses the six regions RvD1 to RvD5 and TBD1 that are absent from the
sequenced strain M. tuberculosis H37Rv, but which have been shown to be present in
other members of the M. tuberculosis complex, like M. canettii, M. africanum, M. bovis,
and M. bovis BCG (3, 7, 13). In M. tuberculosis H37Rv, four of these regions (RvD2-5)
contain a copy of IS6110 which is not flanked by a direct repeat, suggesting that
recombination of two IS6110 elements was involved in the deletion of the intervening
genomic regions (7). In consequence, it seems plausible that these regions were deleted
from the M. tuberculosis H37Rv genome rather than specifically acquired by M. microti.
In addition, three other small insertions have also been found; they are due to the
presence of an IS6110 element in a different location than in M. tuberculosis H37Rv and
M. bovis AF2122/97. Indeed, PvuII RFLP analysis of M. microti OV254 reveals 13
IS6110 elements (data not shown).


Genomic diversity of M. microti strains. In order to obtain a more global picture of the
genetic organization of the taxon M. microti, we evaluated the presence or absence of the
variable regions found in strain OV254 in eight other M. microti strains. These strains,
which were isolated from humans and voles, have been designated as M. microti mainly
on the basis of their specific spoligotype (26, 32, 43) and can be further divided into
subgroups according to the host, such as voles, llama and humans (Table 3). As stated in
the introduction, M. microti is rarely found in humans, unlike M. tuberculosis, so the
availability of nine strains from variable sources for genetic characterization is an
exceptional resource. Among them was one strain (Myc 94-2272) from a severely
immunocompromised individual (43), and four strains were isolated from HIV-positive
or HIV-negative humans with spoligotypes typical of llama and mouse isolates. For one
strain, ATCC 35782 / M.P. Prague, we could not identify with certainty the original host
from which the strain was isolated, nor whether this strain corresponds to M. microti OV166,
which was received by Dr. Sula from Dr. Wells and used thereafter for the vaccination
program in Prague in the 1960's (38).

First, we asked whether these nine strains designated as M. microti on the basis of
their spoligotypes also resembled each other by other molecular typing criteria. As RFLP
of pulsed-field gel separated chromosomal DNA probably represents the most accurate
molecular typing strategy for bacterial isolates, we determined the AseI profiles of the
available M. microti strains, and found that the profiles resembled each other closely but
differed significantly from the macro-restriction patterns of the M. tuberculosis, M. bovis
and M. bovis BCG strains used as controls. However, as depicted in Figure 8A, the
patterns were not identical to each other and each M. microti strain showed subtle
differences, suggesting that they were not epidemiologically related. A similar
observation was made with other rare-cutting restriction enzymes, like DraI or XbaI (data
not shown).



Common and diverging features of M. microti strains. Two strategies were used to
test for the presence or absence of variable regions in these strains, for which we do not
have ordered BAC libraries. First, PCRs using internal and flanking primers of the
variable regions were employed and amplification products of the junction regions were
sequenced. Second, probes from the internal portion of variable regions absent from M.
microti OV254 were obtained by amplification of M. tuberculosis H37Rv DNA using
specific primers. Hybridization with these radio-labeled probes was carried out on blots
from PFGE separated AseI restriction digests of the M. microti strains. In addition, we
confirmed the findings obtained by these two techniques by using a focused macro-array
containing some of the genes identified in variable regions of the tubercle bacilli to date
(data not shown).
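
The PCR strategy with internal primers can be illustrated in silico. The following Python sketch is a simplified model under stated assumptions: it searches only for exact primer matches on one template, using the esat-6F and esat-6R primers (SEQ ID No 9 and 10) and the ESAT-6 coding sequence (SEQ ID No 5) as given in the sequence listing of this document. The function names are hypothetical, and real primer binding tolerates mismatches that this model ignores.

```python
from typing import Optional

# ESAT-6 coding sequence (SEQ ID No 5) as printed in the sequence listing.
ESAT6_CDS = (
    "Atgacagagcagcagtggaatttcgcgggtatcgaggccgcggcaagcgcaatccagggaaatgtcacgtccatt"
    "cattccctccttgacgaggggaagcagtccctgaccaagctcgcagcggcctggggcggtagcggttcggaggcg"
    "taccagggtgtccagcaaaaatgggacgccacggctaccgagctgaacaacgcgctgcagaacctggcgcggacg"
    "atcagcgaagccggtcaggcaatggcttcgaccgaaggcaacgtcactgggatgttcgca"
)
ESAT6_F = "GTCACGTCCATTCATTCCCT"  # primer esat-6F (SEQ ID No 9)
ESAT6_R = "ATCCCAGTGACGTTGCCTT"   # primer esat-6R (SEQ ID No 10)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence (upper case)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the predicted amplicon, or None if either primer has no exact match.

    The forward primer is searched on the given strand; the reverse primer
    anneals to the opposite strand, so its reverse complement is searched
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    amplicon = in_silico_pcr(ESAT6_CDS, ESAT6_F, ESAT6_R)
    if amplicon is None:
        print("no product predicted")
    else:
        print(f"predicted product of {len(amplicon)} bp")
```

A template lacking the primer sites, as would be expected for a strain in which the esat-6 gene is deleted, returns None in this model, which mirrors the absence of an amplification product.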

This led to the finding that the RD1mic deletion is specific for all M. microti strains
tested.

Indeed, none of the M. microti DNA digests hybridized with the radio-labeled esat-6
probe (Fig. 8B), whereas all hybridized with the probe for the RD1mic flanking region
(Fig. 8C). In addition, PCR
amplification using primers flanking the RD1mic region (Table 2) yielded fragments of
the same size for M. microti strains whereas no products were obtained for M.
tuberculosis, M. bovis and M. bovis BCG strains (Fig. 9). Furthermore, the sequence of
the junction region was found to be identical among the strains, which confirms that the
genomic organization of the RD1mic locus was the same in all tested M. microti strains
(Table 3). This clearly demonstrates that M. microti lacks the conserved ESAT-6 family
core region stretching in other members of the M. tuberculosis complex from Rv3864 to
Rv3876 and, as such, represents a taxon of naturally occurring ESAT-6 / CFP-10
deletion mutants.

Like RD1mic, MiD3 was found to be absent from all nine M. microti strains tested and,
therefore, appears to be a specific [...]
(Table 3). However, PCR amplification showed that RD5mic is absent only from the vole
isolates OV254, OV216 and OV183, but present in the M. microti strains isolated from
human and other origins (Table 3). This was confirmed by the presence of single bands,
although of differing sizes, on a Southern blot hybridized with a plcA probe for all M. microti
strains tested except OV254 (Fig. 8D). Interestingly, the presence or absence of RD5mic
correlated with the similarity of IS6110 RFLP profiles. The profiles of the three M.
microti strains isolated from voles in the UK differed considerably from the IS6110
RFLP patterns of human isolates (43). Taken together, these results underline the
proposed involvement of IS6110-mediated deletion of the RD5 region and further
suggest that RD5 may be involved in the variable potential of M. microti strains to cause
disease in humans. Similarly, it was found that MiD1 was missing only from the vole
isolates OV254, OV216 and OV183, which display the same spoligotype (43),
confirming the observations that MiD1 confers the particular spoligotype of a group of
M. microti strains isolated from voles. In contrast, PCR analysis revealed that MiD1 is
only partially deleted from strains B3 and B4, both characterized by the mouse
spoligotype, and from the human isolate M. microti Myc 94-2272 (Table 3). For strain ATCC
35782, deletion of the MiD1 region was not observed. These findings correlate with the
described spoligotypes of the different isolates, as strains that had intact or partially
deleted MiD1 regions had more spacers present than the vole isolates, which only showed
spacers 37 and 38.



COMMENTS AND DISCUSSION

We have searched for major genomic variations, due to insertion-deletion events,
between the vole pathogen, M. microti, and the human pathogen, M. tuberculosis.
BAC-based comparative genomics led to the identification of 10 regions absent from the
genome of the vole bacillus M. microti OV254 and several insertions due to IS6110.
Seven of these deletion regions were also absent from eight other M. microti strains,
isolated from voles or humans, and they account for more than 60 kb of genomic DNA.
Of these regions, RD1mic is of particular interest, because absence of part of this region
has been found to be restricted to the BCG vaccine strains to date. As M. microti was
originally described as non-pathogenic for humans, it is proposed here that RD1 genes are
involved in the pathogenicity for humans. This is reinforced by the fact that RD1bcg (29)
has lost putative ORFs belonging to the esat-6 gene cluster, including the genes encoding
ESAT-6 and CFP-10 (Fig. 6) (40). Both polypeptides have been shown to act as potent
stimulators of the immune system and are antigens recognized during the early stages of
infection (8, 12, 20, 34). Moreover, the biological importance of this RD1 region for
mycobacteria is underlined by the fact that it is also conserved in M. leprae, where genes
ML0047-ML0056 show high similarities in their sequence and operon organization to
the genes in the esat-6 core region of the tubercle bacilli (11). In spite of the radical gene
decay observed in M. leprae, the esat-6 operon apparently has kept its functionality in
this organism.

However, the RD1 deletion may not be the only reason why the vole bacillus is
attenuated for humans. Indeed, it remains unclear why certain M. microti strains included
in the present study that show exactly the same RD1mic deletion as vole isolates have
been found as causative agents of human tuberculosis. As human M. microti cases are
extremely rare, the most plausible explanation for this phenomenon would be that the
infected people were particularly susceptible to mycobacterial infections in general.
This could have been due to an immunodeficiency (32, 43) or to a rare genetic host
predisposition such as interferon gamma- or IL-12 receptor modification (22).

In addition, the finding that human M. microti isolates differed from vole isolates by the
presence of region RD5mic may also have an impact on the increased potential of human
M. microti isolates to cause disease. Intriguingly, BCG and the vole bacillus lack
overlapping portions [...]
plcC) of the four genes encoding phospholipase C (PLC) in M. tuberculosis. PLC has
been recognized as an important virulence factor in numerous bacteria, including
Clostridium perfringens, Listeria monocytogenes and Pseudomonas aeruginosa, where
it plays a role in cell-to-cell spread of bacteria, intracellular survival, and cytolysis (36,
41). To date, the exact role of PLC for the tubercle bacilli remains unclear. plcA encodes
the antigen mtp40, which has previously been shown to be absent from seven tested vole
and hyrax isolates (28). Phospholipase C activity in M. tuberculosis, M. microti and M.
bovis, but not in M. bovis BCG, has been reported (21, 47). However, PLC and
sphingomyelinase activities have been found associated with the most virulent
mycobacterial species (21). The levels of phospholipase C activity detected in M. bovis
were much lower than those seen in M. tuberculosis, consistent with the loss of plcABC.
It is likely that plcD is responsible for the residual phospholipase C activity in strains
lacking RD5, such as M. bovis and M. microti OV254. Indeed, the plcD gene is located
in region RvD2, which is present in some but not all tubercle bacilli (13, 18).
Phospholipase-encoding genes have been recognized as hotspots for integration of
IS6110 and it appears that the regions RD5 and RvD2 undergo independent deletion
processes more frequently than any other genomic regions (44). Thus, the virulence of
some M. microti strains may be due to a combination of functional phospholipase C
encoding genes (7, 25, 26, 29).

Another intriguing detail revealed by this study is that among the deleted genes seven
code for members of the PPE family of Gly-, Ala-, Asn-rich proteins. A closer look at
the sequences of these genes showed that in some cases they encode small proteins with
unique sequences, like for example Rv3873, located in the RD1mic region, or Rv2352c
and Rv2353c, located in the RD5mic region. Others, like Rv3347c, located in the MiD3
region, code for a much larger PPE protein (3157 aa). In this case a neighboring gene
(Rv3345c), belonging to another multigene family, the PE-PGRS family, was partly
affected by the MiD3 deletion. While the function of the PE/PPE proteins is currently
unknown, their predicted abundance in the proteome of M. tuberculosis suggests that
they may play an important role in the life cycle of the tubercle bacilli. Indeed, recently
some of them were shown to be involved in the pathogenicity of M. tuberculosis strains
(9). Complementation of such genomic regions in M. microti OV254 should enable us to
carry out proteomics and virulence studies in animals in order to understand the role of
such ORFs in pathogenesis.

In conclusion, this study has shown that M. microti, a taxon originally named after its
major host Microtus agrestis, the common vole, represents a relatively homogeneous
group of tubercle bacilli. Although all tested strains showed unique PFGE macro-restriction
patterns that differed slightly from each other, deletions that were common
to all M. microti isolates (RD7-RD10, MiD3, RD1mic) have been identified. The
conserved nature of these deletions suggests that these strains are derived from a
common precursor that has lost these regions, and their loss may account for some of the
observed common phenotypic properties of M. microti, like the very slow growth on
solid media and the formation of tiny colonies. This finding is consistent with results
from a recent study that showed that M. microti strains carry a particular mutation in the
gyrB gene (31).

Of particular interest, some of these common features (e.g. the flanking regions of
RD1mic, or MiD3) could be exploited for an easy-to-perform PCR identification test,
similar to the one proposed for a range of tubercle bacilli (33). Such a test would enable
unambiguous and rapid identification of M. microti isolates in order to obtain a better
estimate of the overall rate of M. microti infections in humans and other mammalian
species.
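
The logic of such an identification test can be sketched as a small decision function. The Python sketch below is only an illustration of how the findings reported above might be combined; it is not a validated diagnostic algorithm. It assumes three yes/no inputs: whether the primers flanking RD1mic gave a product, whether the internal esat-6 primers gave a product, and, optionally, whether RD5mic was found to be present by any suitable assay. The function name and the wording of the returned calls are hypothetical.

```python
from typing import Optional


def interpret_pcr(rd1mic_flank_product: bool,
                  esat6_product: bool,
                  rd5_present: Optional[bool] = None) -> str:
    """Combine PCR read-outs into a tentative call (illustrative only)."""
    if rd1mic_flank_product and not esat6_product:
        # Junction product with no esat-6 signal: the pattern reported
        # for all nine M. microti strains tested.
        call = "consistent with M. microti (RD1mic deletion)"
        if rd5_present is True:
            call += "; RD5mic present, as in the non-vole isolates of Table 3"
        elif rd5_present is False:
            call += "; RD5mic absent, as in vole isolates OV254, OV183 and OV216"
        return call
    if esat6_product and not rd1mic_flank_product:
        # esat-6 detected and no RD1mic junction: RD1 region not deleted.
        return "RD1 region present; not consistent with M. microti"
    if not esat6_product and not rd1mic_flank_product:
        # No esat-6 and no RD1mic junction: a different RD1 deletion is
        # possible, such as the RD1bcg deletion of the BCG vaccine strains.
        return "esat-6 absent without RD1mic junction; other RD1 deletion possible"
    return "inconclusive; repeat the assay"


if __name__ == "__main__":
    print(interpret_pcr(True, False, rd5_present=False))
    print(interpret_pcr(False, True))
```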
CLAIMS



1. A strain of M. bovis BCG or M. microti, wherein said strain has integrated all or part
of the RD1 region responsible for enhanced immunogenicity and increased
persistence of BCG to the tubercle bacilli.

2. A strain according to claim 1 which has integrated a portion of DNA originating
from Mycobacterium tuberculosis, which comprises at least one gene selected from
Rv3872 (SEQ ID No 2, mycobacterial PE), Rv3873 (SEQ ID No 3, PPE), Rv3874
(SEQ ID No 4, CFP-10), and Rv3875 (SEQ ID No 5, ESAT-6).

3. A strain according to claim 1 which has integrated a portion of DNA originating
from Mycobacterium tuberculosis, which comprises Rv3875 (SEQ ID No 5, ESAT-6).

4. A strain according to claim 1 which has integrated a portion of DNA originating
from Mycobacterium tuberculosis, which comprises Rv3874 (SEQ ID No 4, CFP-10).

5. A strain according to claim 1 which has integrated a portion of DNA originating
from Mycobacterium tuberculosis, which comprises both Rv3875 (SEQ ID No 5,
ESAT-6) and Rv3874 (SEQ ID No 4, CFP-10).

6. A strain according to one of claims 2 to 5, wherein the coding sequence of the
integrated gene is in frame with its natural promoter or with an exogenous promoter,
such as a promoter capable of directing high level of expression of said coding
sequence.

7. A strain according to one of claims 1 to 5, wherein the integrated gene is
mutated so as to maintain the improved immunogenicity while decreasing the
virulence of the strain.

8. A strain according to claim 7, wherein said strain only carries parts of the genes
coding for ESAT-6 or CFP-10 in a mycobacterial expression vector under the
control of a promoter, more particularly an hsp60 promoter.

9. A strain according to claim 8, wherein said strain carries at least one portion of the
esat-6 gene that codes for immunogenic 20-mer peptides of ESAT-6 active as T-cell
epitopes.

10. A strain according to claim 7, wherein the esat-6 and CFP-10 encoding genes are
altered by directed mutagenesis in a way that most of the immunogenic peptides of
ESAT-6 remain intact, but the biological functionality of ESAT-6 is lost.

11. M. bovis BCG::RD1 strains which have integrated a cosmid herein referred to as RD1-
2F9 and RD1-AP34 contained in the E. coli strains deposited at the CNCM under
the accession numbers I-2831 and I-2832 respectively.

12. M. bovis BCG::RD1 strain which has integrated the construct RD1-AP34 which
contains a 3909 bp fragment of the M. tuberculosis H37Rv genome from region
4350459 bp to 4354367 bp cloned (SEQ ID No 1).

13. M. bovis BCG::RD1 strain which has integrated the fragment RD1-2F9 (~ 32 kb)
that covers the region of the M. tuberculosis genome AL123456 from ca. 4337 kb to
ca. 4369 kb.

14. M. microti::RD1 strain which has integrated the construct RD1-AP34 which
contains a 3909 bp fragment of the M. tuberculosis H37Rv genome from region
4350459 bp to 4354367 bp cloned (SEQ ID No 1).

15. M. microti::RD1 strain which has integrated the fragment RD1-2F9 (~ 32 kb) that
covers the region of the M. tuberculosis genome AL123456 from ca. 4337 kb to ca.
4369 kb.

16. A method for preparing and selecting improved M. bovis BCG or M. microti strains
defined in one of claims 1 to 15 comprising a step consisting of modifying said
strains by insertion, deletion or mutation in the integrated RD1 region, more
particularly in the esat-6 or CFP-10 gene, said method leading to strains that are less
virulent for immuno-depressed individuals.

17. A cosmid or a plasmid comprising all or part of the RD1 region originating from
Mycobacterium tuberculosis, said region comprising at least one gene selected from
Rv3872 (mycobacterial PE), Rv3873 (PPE), Rv3874 (CFP-10), and Rv3875 (ESAT-6).

18. A cosmid or a plasmid according to claim 17 comprising CFP-10, ESAT-6 or both
or a part of them.

19. A cosmid or a plasmid according to claim 18 comprising a mutated gene selected
from CFP-10, ESAT-6 or both, said mutated gene being responsible for the improved
immunogenicity and decreased virulence.

20. Use of a cosmid or a plasmid according to one of claims 17 to 19 for transforming
M. bovis BCG or M. microti.

21. A pharmaceutical composition comprising a strain according to one of claims 1 to
15 and a pharmaceutically acceptable carrier.

22. A pharmaceutical composition according to claim 21 containing suitable
pharmaceutically-acceptable carriers comprising excipients and auxiliaries which
facilitate processing of the living vaccine into preparations which can be used
pharmaceutically.

23. A pharmaceutical composition according to claim 21 or 22 which is suitable for
intravenous or subcutaneous administration.

24. A vaccine comprising a strain according to one of claims 1 to 15 and a suitable
carrier.

25. A product comprising a strain according to one of claims 1 to 15 and at least one
protein selected from ESAT-6 and CFP-10 or epitope derived thereof for a separate,
simultaneous or sequential use for treating tuberculosis.

26. The use of a strain according to one of claims 1 to 15 for preparing a medicament or
a vaccine for preventing or treating tuberculosis.

27. The use of a strain according to one of claims 1 to 15 as an
adjuvant/immunomodulator for preparing a medicament for the treatment of
superficial bladder cancer.

28. A method for the identification at the species level of members of the M.
tuberculosis complex by means of markers for RD1mic and RD5mic as molecular
diagnostic test.



29. A method according to claim 28 comprising the use of a primer selected from :
primer esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9),
primer esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10),
primer RD1mic flanking region F GCAGTGCAAAGGTGCAGATA (SEQ ID No 11),
primer RD1mic flanking region R GATTGAGACACTTGCCACGA (SEQ ID No 12),
primer RD5mic flanking region F GAATGCCGACGTCATATCG (SEQ ID No 16),
primer RD5mic flanking region R CGGCCACTGAGTTCGATTAT (SEQ ID No 17)
and the complementary sequences of said primers.

30. A diagnostic kit for the identification at the species level of members of the M.
tuberculosis complex comprising DNA probes and primers specifically hybridizing to a DNA
portion of the RD1 or RD5 region of M. tuberculosis, more particularly probes
hybridizing under stringent conditions to a gene selected from Rv3871, Rv3872
(mycobacterial PE), Rv3873 (PPE), Rv3874 (CFP-10), Rv3875 (ESAT-6), and Rv3876,
preferably CFP-10 and ESAT-6.

31. A diagnostic kit according to claim 30 comprising a probe or primer selected from :
esat-6F GTCACGTCCATTCATTCCCT (SEQ ID No 9),
esat-6R ATCCCAGTGACGTTGCCTT (SEQ ID No 10),
RD1mic flanking region F GCAGTGCAAAGGTGCAGATA (SEQ ID No 11),
RD1mic flanking region R GATTGAGACACTTGCCACGA (SEQ ID No 12),
RD5mic flanking region F GAATGCCGACGTCATATCG (SEQ ID No 16),
RD5mic flanking region R CGGCCACTGAGTTCGATTAT (SEQ ID No 17).

32. A diagnostic kit for the identification at the species level of members of the M.
tuberculosis complex comprising antibodies directed to mycobacterial PE, PPE, CFP-10 and
ESAT-6.

33. Virulence markers associated with RD1 and/or RD5 regions of the genome of M.
tuberculosis or a part of these regions.

ABSTRACT

The present invention relates to a strain of M. bovis BCG or M. microti, wherein said
strain has integrated part or all of the RD1 region responsible for enhanced
immunogenicity of the tubercle bacilli, especially the ESAT-6 and CFP-10 genes. These
strains will be referred to as the M. bovis BCG::RD1 or M. microti::RD1 strains and are
useful as a new improved vaccine for preventing tuberculosis and as a therapeutic
product enhancing the stimulation of the immune system for the treatment of bladder
cancer.

SEQUENCE LISTING (PROVISIONAL)
RD1-AP34 (a 3909 bp fragment of the M. tuberculosis H37Rv genome)

gaattcccatccagtgagttcaaggtcaagcggcgcccccctggccaggcatttctcgtctcgccagacggcaaa 
gaggtcatccaggccccctacatcgagcctccagaagaagtgttcgcagcacccccaagcgccggttaagattat 
ttcattgccggtgtagcaggacccgagctcagcccggtaatcgagttcgggcaatgctgaccatcgggtttgttt 
ccggctataaccgaacggtttgtgtacgggatacaaatacagggagggaagaagtaggcaaatggaaaaaatgtc 
acatgatccgatcgctgccgacattggcacgcaagtgagcgacaacgctctgcacggcgtgacggccggctcgac 
ggcgctgacgtcggtgaccgggctggttcccgcgggggccgatgaggtctccgcccaagcggcgacggcgttcac 
atcggagggcatccaattgctggcttccaatgcatcggcccaagaccagctccaccgtgcgggcgaagcggtcca 
ggacgtcgcccgcacctattcgcaaatcgacgacggcgccgccggcgtcttcgccgaataggcccccaacacatc 
ggagggagtgatcaccatgctgtggcacgcaatgccaccggagctaaataccgcacggctgatggccggcgcggg 
tccggctccaatgcttgcggcggccgcgggatggcagacgctttcggcggctctggacgctcaggccgtcgagtt 
gaccgcgcgcctgaactctctgggagaagcctggactggaggtggcagcgacaaggcgcttgcggctgcaacgcc 
gatggtggtctggctacaaaccgcgtcaacacaggccaagacccgtgcgatgcaggcgacggcgcaagccgcggc 
atacacccaggccatggccacgacgccgtcgctgccggagatcgccgccaaccacatcacccaggccgtccttac 
ggccaccaacttcttcggtatcaacacgatcccgatcgcgttgaccgagatggattatttcatccgtatgtggaa 
ccaggcagccctggcaatggaggtctaccaggccgagaccgcggttaacacgcttttcgagaagctcgagccgat 
ggcgtcgatccttgatcccggcgcgagccagagcacgacgaacccgatcttcggaatgccctcccctggcagctc 
aacaccggttggccagttgccgccggcggctacccagaccctcggccaactgggtgagatgagcggcccgatgca 
gcagctgLccagccgctgcagcaggtgacgtcgttgttcagccaggtgggcggcaccggcggcggcaacccagc 
Lacgaggaagccgcgcagatgggcctgctcggcaccagtccgctgtcgaaccatccgctggctggtggatcagg 
ccccagcgcgggcgcgggcctgctgcgcgcggagtcgctacctggcgcaggtgggtcgttgacccgcacgccgct 
gatgtctcagctgatcgaaaagccggttgccccctcggtgatgccggcggctgctgccggatcgtcggcgacggg 
Iggcgccgctccggtgggtgcgggagcgatgggccagggtgcgcaatccggcggctccaccaggccgggtctggt 



tcccgtaatgacaacagacttcccggccacccgggccggaagacttgccaacartr^ggcgagg^gg^^aoyoy 
agaaagtagtccagcatggcagagatgaagaccgatgccgctaccctcgcgcaggaggcaggtaatttcgagcgg 
atctccggcgacctgaaaacccagatcgaccaggtggagtcgacggcaggttcgttgcagggccagtggcgcggc 
gcggcgSggacggccgcccaggccgcggtggtgcgcttccaagaagcagccaataagcagaagcaggaactcgac 
gaStctcgacgaatattcgtcaggccggcgtccaatactcgagggccgacgaggagcagcagcaggcgctgtcc 
tcgcaaatgggcttctgacccgctaatacgaaaagaaacggagcaaaaacatgacagagcagcagtggaatttcg 
cg^tatcglggccgclgcaalcgcaatccagggaaatgtcacgtccattcattccctccttgacgaggggaagc 
agtccctgLcLgctcgcagcggcctggggcggtagcggttcggaggcgtaccagggtgtccagcaaaaatggg 

acgccacggctaccgagctgaacaacgcgctgc^^ 

-JZ ^^,^~=^« 1 .^ = .^ 1 -r,^ a frrt-t:cacataaaacaacacGgagttcgcgtagaatagcgaaacacggg 




catggcggccgactacgacaagctcttccggccgcacgaaggtatggaagctccggacgatatggcagcgcagcc 
gttcttcgaccccagtgcttcgtttccgccggcgcccgcatcggcaaacctaccgaagcccaacggccagactcc 
gcccccgacgtccgacgacctgtcggagcggttcgtgtcggccccgccgccgccacccccacccccacctccgcc 
Lcgccaactccgatgccgatcgccgcaggagagccgccctcgccggaaccggccgcatctaaaccacccacacc 
ccccatgcccatcgccggacccgaaccggccccacccaaaccacccacaccccccatgcccatcgccggacccga 
accggccccacccaaaccacccacacctccgatgcccatcgccggacctgcacccaccccaaccgaatcccagtt 




accctcgcacgqgccacatcaaccccggcgcaccgcawayuaui.sv.v.v t y 3a ^«- 3 — „ ^ * 

cccgcccgctccgtccagaccgtctgcgtccccggccgaaccaccgacccggcctgccccccaacactcccgacg 
tgcgcgccggggtcaccgctatcgcacagacaccgaacgaaacgtcgggaaggtagcaactggtccatccatcca- 
gacgcggctgcgggcagaggaagcatccggcgcgcagctcgcccccggaacggagccctcgccagcgccgttggg 
ccaaccgagatcgtatctggctccgcccacccgccccgcgccgacagaacctccccccagcccctcgccgcagcg 
caactccggtcggcgtgccgagcgacgcgtccaccccgatttagccgcccaacatgccgcggcgcaacctgattc 
aattacggccgcaaccactggcggtcgtcgccgcaagcgtgcagcgccggatctcgacgcgacacagaaatcctt 
aaggccggcggccaaggggccgaaggtgaagaaggtgaagccccagaaaccgaaggccacgaagccgcccaaagt 
gg?gtcgcagcgcggctggcgacattgggtgcatgcgttgacgcgaatcaacctgggcctgtcacccgacgagaa 

glacgagctggacSgca^ 

aggtggSgctggcaaaaccacgctgacagcagcgttggggtcgacgttggctcaggtgcgggccgaccggatcct 
ggctctaga 



PE coding sequence (SEQ ID No 2) 

atggaaaaaatgtcacatgatccgatcgctgccgacattggcacgcaagtgagcgacaacgctctgcacggcgtg 
acggccggctcgacggcgctgacgtcggtgaccgggctggttcccgcgggggccgatgaggtctccgcccaagcg 
gcgacggcgttcacatcggagggcatccaattgctggcttccaatgcatcggcccaagaccagctccaccgtgcg 
ggcgaagcggtccaggacgtcgcccgcacctattcgcaaatcgacgacggcgccgccggcgtcttcgccgaa 

PPE coding sequence (SEQ ID No 3) 

atgctgtggcacgcaatgccaccggagctaaataccgcacggctgatggccggcgcgggtccggctccaatgctt 
gcggcggccgcgggatggcagacgctttcggcggctctggacgctcaggccgtcgagttgaccgcgcgcctgaac 
tctctgggagaagcctggactggaggtggcagcgacaaggcgcttgcggctgcaacgccgatggtggtctggcta 
caaaccgcgtcaacacaggccaagacccgtgcgatgcaggcgacggcgcaagccgcggcatacacccaggccatg 
gccacgacgccgtcgctgccggagatcgccgccaaccacatcacccaggccgtccttacggccaccaacttcttc 
ggtatcaacacgatcccgatcgcgttgaccgagatggattatttcatccgtatgtggaaccaggcagccctggca 
atggaggtctaccaggccgagaccgcggttaacacgcttttcgagaagctcgagccgatggcgtcgatccttgat 
cccggcgcgagccagagcacgacgaacccgatcttcggaatgccctcccctggcagctcaacaccggttggccag 
ttgccgccggcggctacccagaccctcggccaactgggtgagatgagcggcccgatgcagcagctgacccagccg 



cagatgggcctgctcggcaccagtccgctgtcgaaccatccgctggctggtggatcaggccccagcgcgggcgcg 
ggcctgctgcgcgcggagtcgctacctggcgcaggtgggtcgttgacccgcacgccgctgatgtctcagctgatc 
gaaaagccggttgccccctcggtgatgccggcggctgctgccggatcgtcggcgacgggtggcgccgctccggtg 
ggtgcgggagcgatgggccagggtgcgcaatccggcggctccaccaggccgggtctggtcgcgccggcaccgctc 
gcgcaggagcgtgaagaagacgacgaggacgactgggacgaagaggacgactgg 

CFP-10 coding sequence (SEQ ID No 4) 

atggcagagatgaagaccgatgccgctaccctcgcgcaggaggcaggtaatttcgagcggatctccggcgacctg 
aaaacccagatcgaccaggtggagtcgacggcaggttcgttgcagggccagtggcgcggcgcggcggggacggcc 
gcccaggccgcggtggtgcgcttccaagaagcagccaataagcagaagcaggaactcgacgagatctcgacgaat 
attcgtcaggccggcgtccaatactcgagggccgacgaggagcagcagcaggcgctgtcctcgcaaatgggcttc 



ESAT-6 coding sequence (SEQ ID No 5) 

Atgacagagcagcagtggaatttcgcgggtatcgaggccgcggcaagcgcaatccagggaaatgtcacgtccatt 
cattccctccttgacgaggggaagcagtccctgaccaagctcgcagcggcctggggcggtagcggttcggaggcg 
taccagggtgtccagcaaaaatgggacgccacggctaccgagctgaacaacgcgctgcagaacctggcgcggacg 
atcagcgaagccggtcaggcaatggcttcgaccgaaggcaacgtcactgggatgttcgca 

CFP-10 + ESAT-6 (SEQ ID No 6) 

atggcagagatgaagaccgatgccgctaccctcgcgcaggaggcaggtaatttcgagcggat 
ctccggcgacctgaaaacccagatcgaccaggtggagtcgacggcaggttcgttgcagggcc 
agtggcgcggcgcggcggggacggccgcccaggccgcggtg 

gtgcgcttccaagaagcagccaataagcagaagcaggaactcgacgagatctcgacgaatat 
tcgtcaggccggcgtccaatactcgagggccgacgaggagcagcagcaggcgctgtcctcgc 
aaatgggcttctgacccgctaatacgaaaagaaacggagcaaaaacatgacagagcagcagt 
ggaatttcgcgggtatcgaggccgcggcaagcgcaatccagggaaatgtcacgtccattcat 
tccctccttgacgaggggaagcagtccctgaccaagctcgcagcggcctggggcggtagcgg 




aacgtcactgggatgttcgca 
Primer SP6-BAC1
AGTTAGCTCACTCATTAGGCA (SEQ ID No 7)

Primer T7-BAC1
GGATGTGCTGCAAGGCGATTA (SEQ ID No 8)

primer esat-6F
GTCACGTCCATTCATTCCCT (SEQ ID No 9)

primer esat-6R
ATCCCAGTGACGTTGCCTT (SEQ ID No 10)

primer RD1mic flanking region F
GCAGTGCAAAGGTGCAGATA (SEQ ID No 11)

primer RD1mic flanking region R
GATTGAGACACTTGCCACGA (SEQ ID No 12)

primer plcA.int.F
CAAGTTGGGTCTGGTCGAAT (SEQ ID No 13)

primer plcA.int.R
GCTACCCAAGGTCTCCTGGT (SEQ ID No 14)

CCT (SEQ ID No 15)

primer RD5mic flanking region F
GAATGCCGACGTCATATCG (SEQ ID No 16)

primer RD5mic flanking region R
CGGCCACTGAGTTCGATTAT (SEQ ID No 17)

AC (SEQ ID No 18) 

primer MiDl flanking region F 

CRGCCAACRCCAAGTAGACG (SEQ ID NO 19) 



primer MiDl flanking region R 

TCTACCTGCAGTCGCTTGTG (SEQ ID NO 20) 



Sequence at the junction MiDl 

CACCTC^CMC^rcCCATCCra 

CATCTTCTGGCAGATCAGCAGATCGCTTGTTCTCAGTGCAGGTGAGTC (SEQ ID No 21) 

primer MiD2 flanking region R

GTCCATCGAGGATGTCGAGT (SEQ ID NO 22)

primer MiD2 flanking region L

CTAGGCCATTCCGTTGTCTG (SEQ ID NO 23)

Sequence at the junction MiD2

GCTGCCTACOT^GCTCAACGCCAGAGACCa^
C (SEQ ID NO 24)

primer MiD3 flanking region R

GGCGACGCCATTTCC (SEQ ID NO 25)

primer MiD3 flanking region L

AACTGTCGGGCTTGCTCTT (SEQ ID NO 26)

Sequence at the junction MiD3

TGGCGCCGGCACCTCC^TTGCCACCGTT^
GGGCXX^TTCGCCCTCAGCCGCTAAACACGCCGACCAAGATC
CCCCCATATCAGCAAGGGCCCGGTGTCGGCG (SEQ ID No 27)